Asymptotic Approximation of 
Marginal Likelihood Integrals 



Shaowei Lin 



Abstract 

The accurate asymptotic evaluation of marginal likelihood integrals is 
a fundamental problem in Bayesian statistics. Following the approach in- 
troduced by Watanabe, we translate this into a problem of computational 
algebraic geometry, namely, to determine the real log canonical threshold 
of a polynomial ideal, and we present effective methods for solving this 
problem. Our results are based on resolution of singularities, and they 
apply to all statistical models for discrete data that admit a parametriza- 
tion by real analytic functions. 

1 Introduction

The evaluation of marginal likelihood integrals is essential in model selection and 
has important applications in areas such as machine learning and computational 
biology. The exact evaluation of such integrals is a difficult problem [TJHH] and 
classical approximation formulas usually apply only for smooth models. Recent 
work by Watanabe and his collaborators [T1 I24H27] extended these formulas to a 
broad class of models with singularities. His work also uncovered interesting con- 
nections with resolution of singularities in algebraic geometry. The goal of this 
paper is to systematically study the algebraic geometry behind Watanabe's for- 
mulas, and to develop symbolic algebra tools which allow the user to accurately 
evaluate the asymptotics of integrals in Bayesian statistics. 

Watanabe showed that the key to understanding a singular model is monomializing
the Kullback-Leibler function $K(\omega)$ of the model at the true distribution.
While general algorithms exist for monomializing any analytic function, applying
them to non-polynomial functions such as $K(\omega)$ can be computationally
expensive. In practice, many singular models are parametrized by polynomials.
Therefore, it is natural to ask if this polynomiality can be exploited in the
analysis of such models. In this paper, we explore this question for discrete
statistical models. Our point of departure is to describe the asymptotics of the
likelihood integral by the real log canonical threshold of an ideal in a
polynomial ring. For generality, our results will instead be proved for rings of
analytic functions.

Consider a statistical model $\mathcal{M}$ on a finite discrete space
$[k] = \{1, 2, \ldots, k\}$ parametrized by a real analytic map
$p : \Omega \to \Delta_{k-1}$ where $\Omega$ is a compact subset of
$\mathbb{R}^d$ and $\Delta_{k-1}$ is the probability simplex
$\{x \in \mathbb{R}^k : x_i \geq 0, \sum x_i = 1\}$. We assume that $\Omega$ is
semianalytic, i.e. $\Omega = \{x \in \mathbb{R}^d : g_1(x) \geq 0, \ldots, g_l(x) \geq 0\}$
is defined by real analytic inequalities. Let $q \in \Delta_{k-1}$ be a point in
the model with non-zero entries. Suppose a sample of size $N$ is drawn from the
true distribution $q$, and let $U = (U_i)$ denote the vector of relative
frequencies for this sample. Let $\varphi : \Omega \to \mathbb{R}$ be nearly
analytic, i.e. $\varphi$ is a product $\varphi_a \varphi_s$ of functions where
$\varphi_a$ is real analytic and $\varphi_s$ is positive and smooth. Consider a
Bayesian prior defined by $|\varphi|$. We want to study the asymptotic behavior,
as the sample size $N$ grows large, of the marginal likelihood integral

$$Z(N) = \int_\Omega \prod_{i=1}^{k} p_i(\omega)^{N U_i} \, |\varphi(\omega)| \, d\omega. \tag{1}$$

The first few terms of the asymptotics of the log likelihood integral
$\log Z(N)$ were derived by Watanabe. To state his result, we first recall that
the Kullback-Leibler distance $K(\omega)$ between $q$ and $p(\omega)$ is

$$K(\omega) = \sum_{i=1}^{k} q_i \log \frac{q_i}{p_i(\omega)}.$$

This function satisfies $K(\omega) \geq 0$ with equality if and only if
$p(\omega) = q$.
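
As a small illustration of this definition, the following Python sketch evaluates the Kullback-Leibler distance for a toy two-state model. The model `p(w) = (w, 1 - w)`, the choice `q = (1/2, 1/2)`, and the names `kl_distance` and `model` are assumptions made for the example and do not come from the paper.

```python
import math

def kl_distance(q, p):
    """Kullback-Leibler distance sum_i q_i * log(q_i / p_i) for discrete distributions."""
    if len(q) != len(p):
        raise ValueError("q and p must have the same length")
    # Terms with q_i = 0 contribute nothing; p_i must be positive wherever q_i > 0.
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Hypothetical toy model on [k] = {1, 2}: p(w) = (w, 1 - w) with 0 < w < 1.
def model(w):
    return (w, 1.0 - w)

q = (0.5, 0.5)                            # assumed true distribution
k_at_truth = kl_distance(q, model(0.5))   # vanishes because p(0.5) = q
k_elsewhere = kl_distance(q, model(0.2))  # strictly positive because p(0.2) != q
```

The two evaluations mirror the statement above: the distance is zero at the parameter where the model reproduces q and positive elsewhere.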

Theorem 1.1 (Watanabe [25, §6]). Asymptotically as $N \to \infty$,

$$\log Z(N) = N \sum_{i=1}^{k} U_i \log q_i - \lambda \log N + (\theta - 1) \log \log N + \eta_N \tag{2}$$

where the positive rational number $\lambda$ is the smallest pole of the zeta
function

$$\zeta(z) = \int_\Omega K(\omega)^{-z} \, |\varphi(\omega)| \, d\omega, \quad z \in \mathbb{C}, \tag{3}$$

$\theta$ is its multiplicity, and $\eta_N$ is a random variable whose
expectation $E[\eta_N]$ converges to a constant.

Here, $\lambda$ is known as the learning coefficient of the model at the
distribution $q$. Because formula (2) generalizes the Bayesian information
criterion [TTJ[25], the numbers $\lambda$ and $\theta$ are important in model
selection. Indeed, the BIC corresponds to the case
$(\lambda, \theta) = (d/2, 1)$ for smooth models. In algebraic geometry,
$\lambda$ is also known as the real log canonical threshold [5U] of $K$, a term
that is motivated by the more familiar complex log canonical threshold (see
Remark 3.1).
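
To make the role of the two numbers in formula (2) concrete, here is a minimal Python sketch that evaluates its deterministic leading terms and specializes them to the BIC case. The function names are hypothetical, and the bounded random term in (2) is deliberately left out.

```python
import math

def log_marginal_leading_terms(N, U, q, lam, theta):
    """Leading terms of formula (2): N*sum(U_i*log q_i) - lam*log N + (theta-1)*log log N.

    The random O(1) term eta_N is omitted, so this is only the deterministic part.
    """
    if N <= 1:
        raise ValueError("N must exceed 1 so that log log N is defined")
    fit = N * sum(u * math.log(qi) for u, qi in zip(U, q) if u > 0)
    return fit - lam * math.log(N) + (theta - 1) * math.log(math.log(N))

def bic_leading_terms(N, U, q, d):
    """Smooth-model special case (lam, theta) = (d/2, 1), i.e. the BIC penalty."""
    return log_marginal_leading_terms(N, U, q, d / 2, 1)
```

For a fixed sample, a smaller learning coefficient gives a larger approximate log marginal likelihood, which is why singular models are penalized less than the BIC would suggest.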

These thresholds may be defined for ideals in rings of real-valued analytic
functions as well. Given an ideal $I = \langle f_1, \ldots, f_r \rangle$
generated by functions $f_i$ which are real analytic on a compact subset
$\Omega \subset \mathbb{R}^d$ and given a smooth amplitude function
$\varphi : \mathbb{R}^d \to \mathbb{R}$, we consider the zeta function

$$\zeta(z) = \int_\Omega \left( f_1(\omega)^2 + \cdots + f_r(\omega)^2 \right)^{-z/2} |\varphi(\omega)| \, d\omega. \tag{4}$$

If $\varphi$ is nearly analytic, $\zeta(z)$ has an analytic continuation to the
whole complex plane. Its poles are positive rational numbers with a smallest
element $\lambda$ which we call the real log canonical threshold of $I$ with
respect to $\varphi$ over $\Omega$. Let $\theta$ be the multiplicity of
$\lambda$ as a pole of $\zeta(z)$ and define $\mathrm{RLCT}_\Omega(I; \varphi)$
to be the pair $(\lambda, \theta)$. This pair does not depend on the choice of
generators $f_1, \ldots, f_r$ for $I$. In the literature, real log canonical
thresholds of ideals are not well-investigated [20]. For this reason, we
formally prove many of their properties in Section 3.

With these definitions in hand, we now state our first main theorem. This
result expresses the learning coefficient and its multiplicity directly in terms
of the functions $p_1, \ldots, p_k$ parametrizing the model. Geometrically, it
says that the learning coefficient is the real log canonical threshold of the
fiber $p^{-1}(q) \subset \Omega$. The theorem is computationally very useful,
especially when the $p_i$ are polynomials or rational functions, and certain
special cases have been applied by Sumio Watanabe and his collaborators
[26, 27]. Our proof in Section 3 was inspired by a discussion with him. Now,
recall that $\varphi = \varphi_a \varphi_s$ is nearly analytic.

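Theorem 1.2. Let $\lambda$ be the learning coefficient of the statistical model
$\mathcal{M}$ and $\theta$ its multiplicity. Let
$I = \langle p(\omega) - q \rangle := \langle p_1(\omega) - q_1, \ldots, p_k(\omega) - q_k \rangle$
be the ideal of the fiber
$V = p^{-1}(q) = \{\omega \in \Omega : p(\omega) = q\}$. Then,

$$(2\lambda, \theta) = \min_{x \in V} \mathrm{RLCT}_{\Omega_x}(I; \varphi_a)$$

where each $\Omega_x$ is a sufficiently small neighborhood of $x$ in $\Omega$.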

To prove this theorem and other properties of real log canonical thresholds, we
recall Hironaka's theorem on the resolution of singularities [14] and develop
useful lemmas in Section 2. Our treatment differs from that of Watanabe [25] in
the following way: we study the local behavior of real log canonical thresholds
at points $x$ in the parameter space $\Omega$. In particular, we will be
interested in the case where $x$ is on the boundary $\partial\Omega$.
Example 2.7 is an illustration of how the threshold is affected by the
inequalities $g_i \geq 0$ which are active at $x$. This issue can be critical in
singular model selection because the parameter space of one model is often
contained in the boundary of another that is more complex.

After studying the local thresholds, we then show that the real log canonical
threshold globally over $\Omega$ is the minimum of local thresholds at points
$x$ in $\Omega$. Identifying where these minimum thresholds occur is by itself a
difficult problem which we discuss in Section 2. As a consequence of our
results, we write down explicit formulas for the coefficients in asymptotic
expansions of Laplace integrals. Our formulas extend those of
Arnol'd-Gusein-Zade-Varchenko [2] because they apply also to parameter spaces
with boundary. Using this expansion to improve approximations of likelihood
integrals will be the subject of future work.

Our next aim is to develop tools for computing or bounding real log canonical
thresholds of ideals. Section 3 summarizes useful fundamental properties of real
log canonical thresholds. In Section 4, we derive local thresholds in
nondegenerate cases using an important tool from toric geometry involving Newton
polyhedra. This method was invented by Varchenko [22] and applied to statistical
models by Watanabe and Yamazaki [27]. Their formulas were defined for functions,
but we develop extensions of these formulas for ideals. We introduce a new
notion of nondegeneracy for ideals, known as sos-nondegeneracy, and give the
following bound for the real log canonical threshold of an ideal with respect to
a monomial amplitude function
$\omega^\tau := \omega_1^{\tau_1} \cdots \omega_d^{\tau_d}$. These monomial
functions occur frequently when we apply a change of variables to resolve the
singularities in a model. Newton polyhedra and their $\tau$-distances are
defined in Section 4.

Theorem 1.3. Let $I$ be a finitely generated ideal in the ring of functions
which are real analytic on $\Omega$, and suppose the origin lies in the interior
of $\Omega$. Then, for every sufficiently small neighborhood $\Omega_0$ of the
origin,

$$\mathrm{RLCT}_{\Omega_0}(I; \omega^\tau) \leq (1/l_\tau, \theta_\tau)$$

where $l_\tau$ is the $\tau$-distance of the Newton polyhedron $\mathcal{P}(I)$
and $\theta_\tau$ its multiplicity. Equality occurs when $I$ is monomial or,
more generally, sos-nondegenerate.

This theorem has two main consequences. Firstly, it tells us that the real log
canonical threshold of an ideal can be computed by finding a change of variables
which monomializes the ideal. Secondly, due to Theorems 1.1 and 1.2, upper
bounds on real log canonical thresholds translate to asymptotic lower bounds on
the likelihood integral of a statistical model, which in turn give upper bounds
on the stochastic complexity of the model.

Currently, there are no programs for computing real log canonical thresholds.
There are applications which compute resolutions of singularities, but our
statistical problems are too big for them. We hope that our work is a step in
bridging the gap. Some of our tools are implemented in a Singular library at

http://math.berkeley.edu/~shaowei/rlct.html

This library computes the Newton polyhedron of an ideal, computes
$\tau$-distances, and checks if an ideal is sos-nondegenerate. Instructions and
examples on using the library may be found at the above website.

In summary, the learning coefficient of a statistical model is a useful measure
of the model complexity and plays an important role in model selection. Because
computing this coefficient often requires careful analysis of the
Kullback-Leibler function, we propose an ideal-theoretic approach to make this
calculation more tractable. This method has several advantages. Firstly, it
directly exploits polynomiality in the model parametrization. Secondly, the real
log canonical threshold of an ideal is independent of the choice of generators,
and this choice provides flexibility to our computations. Thirdly, it is easier
to construct Newton polyhedra for polynomial ideals and to check their
nondegeneracy (Proposition ^. 2f 3) than for nonpolynomial Kullback-Leibler
functions. We demonstrate these ideas in Section 5 by computing the learning
coefficients of a discrete mixture model which comes from a study involving 132
schizophrenic patients.

To introduce some notation, given $x \in \mathbb{R}^d$, let
$\mathcal{A}_x(\mathbb{R}^d)$ be the ring of real-valued functions
$f : \mathbb{R}^d \to \mathbb{R}$ that are analytic at $x$. We sometimes shorten
the notation to $\mathcal{A}_x$ when it is clear that we are working with the
space $\mathbb{R}^d$. When $x = 0$, it is convenient to think of
$\mathcal{A}_0$ as a subring of the formal power series ring
$\mathbb{R}[[\omega_1, \ldots, \omega_d]] = \mathbb{R}[[\omega]]$. It consists
of power series which are convergent in some neighborhood of the origin. For all
$x$, $\mathcal{A}_x \simeq \mathcal{A}_0$ by translation. Given a subset
$\Omega \subset \mathbb{R}^d$, let $\mathcal{A}_\Omega$ be the ring of real
functions analytic at each point $x \in \Omega$. Locally, each function can be
represented as a power series centered at $x$. Given
$f \in \mathcal{A}_\Omega$, define the analytic variety
$V_\Omega(f) = \{\omega \in \Omega : f(\omega) = 0\}$ while for an ideal
$I \subset \mathcal{A}_\Omega$, we set
$V_\Omega(I) = \bigcap_{f \in I} V_\Omega(f)$. Lastly, given a finite set
$S \subset \mathbb{R}$, let $\#\min S$ denote the number of times the minimum is
attained in $S$.



2 Resolution of Singularities 

In this section, we introduce Hironaka's theorem on resolutions of singularities. 
We derive real log canonical thresholds of monomial functions, and demonstrate 
how such resolutions allow us to find the thresholds of non-monomial functions. 
We show that the threshold of a function over a compact set is the minimum 
of local thresholds, and present an example where the threshold at a boundary
point depends on the boundary inequalities. We discuss the problem of locating
singularities with the smallest threshold, and end this section with formulas for 
the asymptotic expansion of a Laplace integral. 

Before we explore real log canonical thresholds of ideals, let us study those of
functions. Given a compact subset $\Omega$ of $\mathbb{R}^d$, a real analytic
function $f \in \mathcal{A}_\Omega$ and a smooth function
$\varphi : \mathbb{R}^d \to \mathbb{R}$, consider the zeta function

$$\zeta(z) = \int_\Omega |f(\omega)|^{-z} \, |\varphi(\omega)| \, d\omega, \quad z \in \mathbb{C}. \tag{5}$$

This function is well-defined for $z \in \mathbb{R}_{\leq 0}$. If $\zeta(z)$ can
be continued analytically to the whole complex plane $\mathbb{C}$, then all its
poles are isolated points in $\mathbb{C}$. Moreover, if all its poles are real,
then there exists a smallest positive pole $\lambda$. Let $\theta$ be the
multiplicity of this pole. The pole $\lambda$ is the real log canonical
threshold of $f$ with respect to $\varphi$ over $\Omega$. If $\zeta(z)$ has no
poles, we set $\lambda = \infty$ and leave $\theta$ undefined. Let
$\mathrm{RLCT}_\Omega(f; \varphi)$ be the pair $(\lambda, \theta)$. By abuse of
notation, we sometimes refer to this pair as the real log canonical threshold of
$f$. We order these pairs such that
$(\lambda_1, \theta_1) > (\lambda_2, \theta_2)$ if $\lambda_1 > \lambda_2$, or
$\lambda_1 = \lambda_2$ and $\theta_1 < \theta_2$. Lastly, let
$\mathrm{RLCT}_\Omega f$ denote $\mathrm{RLCT}_\Omega(f; 1)$ where $1$ is the
constant unit function.
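
The ordering on threshold pairs is easy to get backwards, since a larger multiplicity makes a pair smaller. A minimal Python sketch of the ordering, with hypothetical helper names `rlct_sort_key` and `rlct_min`, might look as follows.

```python
from fractions import Fraction

def rlct_sort_key(pair):
    """Sort key for pairs (lam, theta): smaller lam first, then larger theta first."""
    lam, theta = pair
    return (lam, -theta)

def rlct_min(pairs):
    """Minimum of a finite collection of threshold pairs under the ordering above."""
    return min(pairs, key=rlct_sort_key)

# Among pairs with the same threshold 1/2, the one with multiplicity 2 is the smallest.
candidates = [(Fraction(1, 2), 1), (Fraction(1, 2), 2), (Fraction(2, 3), 1)]
smallest = rlct_min(candidates)
```

Exact rationals from `fractions.Fraction` are used so that ties between thresholds are detected reliably.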

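We start with a simple class of functions for which it is easy to compute the
real log canonical threshold. It is the class of monomials
$\omega_1^{\kappa_1} \cdots \omega_d^{\kappa_d} = \omega^\kappa$.

Proposition 2.1. Let $\kappa = (\kappa_1, \ldots, \kappa_d)$ and
$\tau = (\tau_1, \ldots, \tau_d)$ be vectors of non-negative integers. If
$\Omega$ is the positive orthant $\mathbb{R}^d_{\geq 0}$ and
$\phi : \mathbb{R}^d \to \mathbb{R}$ is compactly supported and smooth with
$\phi(0) > 0$, then
$\mathrm{RLCT}_\Omega(\omega^\kappa; \omega^\tau \phi) = (\lambda, \theta)$ where

$$\lambda = \min_{1 \leq j \leq d} \left\{ \frac{\tau_j + 1}{\kappa_j} \right\}, \qquad \theta = \#\min_{1 \leq j \leq d} \left\{ \frac{\tau_j + 1}{\kappa_j} \right\}.$$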

Proof. See [2, Lemma 7.3]. The idea is to express $\phi(\omega)$ as
$T_s(\omega) + R_s(\omega)$ where $T_s$ is the $s$-th degree Taylor polynomial
and $R_s$ the difference. We then integrate the main term $|f|^{-z} T_s$
explicitly and show that the integral of the remaining term $|f|^{-z} R_s$ does
not have larger poles. This process gives the analytic continuation of
$\zeta(z)$ to the whole complex plane, so we have the Laurent expansion

$$\zeta(z) = P(z) + \sum_{\alpha > 0} \sum_{i=1}^{d} \frac{d_{\alpha,i}}{(z - \alpha)^i} \tag{6}$$

where the poles $\alpha$ are positive rational numbers and $P(z)$ is a polynomial. □
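
The formula of Proposition 2.1 translates directly into code. The Python sketch below is an illustration only: the function name `monomial_rlct` is hypothetical, and the convention of skipping coordinates with zero exponent (which contribute no pole) and returning `None` when no pole exists is an assumption of the example.

```python
from fractions import Fraction

def monomial_rlct(kappa, tau):
    """Return (lam, theta) for the monomial w^kappa with amplitude w^tau times a
    positive smooth factor, following Proposition 2.1.

    Coordinates with kappa_j == 0 are skipped; None is returned if none remain.
    """
    ratios = [Fraction(t + 1, k) for k, t in zip(kappa, tau) if k > 0]
    if not ratios:
        return None
    lam = min(ratios)
    theta = ratios.count(lam)  # number of times the minimum is attained
    return lam, theta

# The monomial x * y^2 with constant amplitude: ratios 1/1 and 1/2.
example = monomial_rlct((1, 2), (0, 0))
```

For the monomial `x * y^2` with trivial amplitude the minimum ratio is 1/2, attained once, so the sketch returns the pair (1/2, 1).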

For non-monomial $f(\omega)$, Hironaka's celebrated theorem [14] on the
resolution of singularities tells us that we can always reduce to the monomial
case.

Theorem 2.2 (Resolution of Singularities). Let $f$ be a non-constant real
analytic function in some neighborhood $\Omega \subset \mathbb{R}^d$ of the
origin with $f(0) = 0$. Then, there exists a triple $(M, W, \rho)$ where

a. $W \subset \Omega$ is a neighborhood of the origin,

b. $M$ is a $d$-dimensional real analytic manifold,

c. $\rho : M \to W$ is a real analytic map

satisfying the following properties.

i. $\rho$ is proper, i.e. the inverse image of any compact set is compact.

ii. $\rho$ is a real analytic isomorphism between $M \setminus V_M(f \circ \rho)$
and $W \setminus V_W(f)$.

iii. For any $y \in V_M(f \circ \rho)$, there exists a local chart $M_y$ with
coordinates $\mu = (\mu_1, \mu_2, \ldots, \mu_d)$ such that $y$ is the origin and

$$f \circ \rho(\mu) = a(\mu) \, \mu_1^{\kappa_1} \mu_2^{\kappa_2} \cdots \mu_d^{\kappa_d} = a(\mu) \mu^\kappa$$

where $\kappa_1, \kappa_2, \ldots, \kappa_d$ are non-negative integers and $a$
is a real analytic function with $a(\mu) \neq 0$ for all $\mu$. Furthermore, the
Jacobian determinant equals

$$|\rho'(\mu)| = h(\mu) \, \mu_1^{\tau_1} \mu_2^{\tau_2} \cdots \mu_d^{\tau_d} = h(\mu) \mu^\tau$$

where $\tau_1, \tau_2, \ldots, \tau_d$ are non-negative integers and $h$ is a
real analytic function with $h(\mu) \neq 0$ for all $\mu$.

We say that $(M, W, \rho)$ is a resolution of singularities or a
desingularization of $f$ at the origin. The set of points in $M$ where $\rho$ is
not one-to-one is the exceptional divisor. Now, let us desingularize a list of
functions simultaneously.

Corollary 2.3 (Simultaneous Resolutions). Let $f_1, \ldots, f_l$ be non-constant
real analytic functions in some neighborhood $\Omega \subset \mathbb{R}^d$ of
the origin with all $f_i(0) = 0$. Then, there exists a triple $(M, W, \rho)$
that desingularizes each $f_i$ at the origin.

Proof. The idea is to desingularize the product
$f_1(\omega) \cdots f_l(\omega)$ and to show that such a resolution of
singularities is also a resolution for each $f_i$. See [25, Thm 11] and
[HI Lemma 2.3] for details. □

For the rest of this section, let
$\Omega = \{\omega \in \mathbb{R}^d : g_1(\omega) \geq 0, \ldots, g_l(\omega) \geq 0\}$
be compact and semianalytic. We also assume that
$f, \varphi \in \mathcal{A}_\Omega$.

Lemma 2.4. For each $x \in \Omega$, there is a neighborhood $\Omega_x$ of $x$ in
$\Omega$ such that for all smooth functions $\phi$ on $\Omega_x$ with
$\phi(x) > 0$,

$$\mathrm{RLCT}_{\Omega_x}(f; \varphi\phi) = \mathrm{RLCT}_{\Omega_x}(f; \varphi).$$

Proof. Let $x \in \Omega$. If $f(x) \neq 0$, then by the continuity of $f$,
there exists a small neighborhood $\Omega_x$ where
$0 < c_1 < |f(\omega)| < c_2$ for some constants $c_1, c_2$. Hence, for all
smooth functions $\phi$, the zeta functions

$$\int_{\Omega_x} |f(\omega)|^{-z} |\varphi(\omega)\phi(\omega)| \, d\omega \quad \text{and} \quad \int_{\Omega_x} |f(\omega)|^{-z} |\varphi(\omega)| \, d\omega$$

do not have any poles, so the lemma follows in this case.

Suppose $f(x) = 0$. By Corollary 2.3, we have a simultaneous local resolution of
singularities $(M, W, \rho)$ for the functions $f, \varphi, g_1, \ldots, g_l$
vanishing at $x$. For each point $y$ in the fiber $\rho^{-1}(x)$, we have a
local chart $M_y$ satisfying property (iii) of Theorem 2.2. Since $\rho$ is
proper, the fiber $\rho^{-1}(x)$ is compact so there is a finite subcover
$\{M_y\}$. We claim that the image $\rho(\bigcup M_y)$ contains a neighborhood
$W_x$ of $x$ in $\mathbb{R}^d$. Indeed, otherwise, there exists a bounded
sequence $\{x_1, x_2, \ldots\}$ of points in $W \setminus \rho(\bigcup M_y)$
whose limit is $x$. We pick a sequence $\{y_1, y_2, \ldots\}$ such that
$\rho(y_i) = x_i$. Since the $x_i$ are bounded and $\rho$ is proper, the $y_i$
lie in a compact set so there is a convergent subsequence with limit $y_*$. The
$y_i$ are not in the open set $\bigcup M_y$ so nor is $y_*$. But
$\rho(y_*) = \lim \rho(y_i) = x$ so
$y_* \in \rho^{-1}(x) \subset \bigcup M_y$, a contradiction.

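Now, define $\Omega_x = W_x \cap \Omega$ and let $\{\mathcal{M}_y\}$ be the
collection of all sets $\mathcal{M}_y = M_y \cap \rho^{-1}(\Omega_x)$ which have
positive measure. Picking a partition of unity $\{\sigma_y(\mu)\}$ subordinate
to $\{\mathcal{M}_y\}$ such that $\sigma_y$ is positive at $y$ for each $y$, we
write the zeta function
$\zeta(z) = \int_{\Omega_x} |f(\omega)|^{-z} |\varphi(\omega)\phi(\omega)| \, d\omega$
as

$$\sum_y \int_{\mathcal{M}_y} |f \circ \rho(\mu)|^{-z} \, |\varphi \circ \rho(\mu)| \, |\phi \circ \rho(\mu)| \, |\rho'(\mu)| \, \sigma_y(\mu) \, d\mu.$$

For each $y$, the boundary conditions $g_i \circ \rho(\mu) \geq 0$ become
monomial inequalities, so $\mathcal{M}_y$ is the union of orthant neighborhoods
of $y$. The integral over $\mathcal{M}_y$ is then the sum of integrals of the
form

$$\zeta_y(z) = \int_{\mathbb{R}^d_{\geq 0}} \mu^{-\kappa z + \tau} \psi(\mu) \, d\mu$$

where $\kappa$ and $\tau$ are non-negative integer vectors while $\psi$ is a
compactly supported smooth function with $\psi(0) > 0$. Note that $\kappa$ and
$\tau$ do not depend on $\phi$ nor on the choice of orthant at $y$. By
Proposition 2.1, the smallest pole of $\zeta_y(z)$ and its multiplicity are

$$\lambda_y = \min_{1 \leq j \leq d} \left\{ \frac{\tau_j + 1}{\kappa_j} \right\}, \qquad \theta_y = \#\min_{1 \leq j \leq d} \left\{ \frac{\tau_j + 1}{\kappa_j} \right\}.$$

Now, $\mathrm{RLCT}_{\Omega_x}(f; \varphi\phi) = \min_y \{(\lambda_y, \theta_y)\}$.
Since this formula is independent of $\phi$, we set $\phi = 1$ and the lemma
follows. □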

Proposition 2.5. Let $\phi : \Omega \to \mathbb{R}$ be positive and smooth.
Then, for sufficiently small neighborhoods $\Omega_x$, the set
$\{\mathrm{RLCT}_{\Omega_x}(f; \varphi) : x \in \Omega\}$ has a minimum and

$$\mathrm{RLCT}_\Omega(f; \varphi\phi) = \min_{x \in \Omega} \mathrm{RLCT}_{\Omega_x}(f; \varphi).$$

Proof. Lemma 2.4 associates a small neighborhood to each point in the compact
set $\Omega$, so there exists a subcover $\{\Omega_x : x \in S\}$ where $S$ is
finite. Let $\{\sigma_x(\omega)\}$ be a partition of unity subordinate to this
subcover. Then,

$$\int_\Omega |f(\omega)|^{-z} |\varphi(\omega)\phi(\omega)| \, d\omega = \sum_{x \in S} \int_{\Omega_x} |f(\omega)|^{-z} |\varphi(\omega)\phi(\omega)| \, \sigma_x(\omega) \, d\omega.$$

From this finite sum, we have

$$\mathrm{RLCT}_\Omega(f; \varphi\phi) = \min_{x \in S} \mathrm{RLCT}_{\Omega_x}(f; \varphi\phi\sigma_x) = \min_{x \in S} \mathrm{RLCT}_{\Omega_x}(f; \varphi).$$

Now, if $y \in \Omega \setminus S$, let $\Omega_y$ be a neighborhood of $y$
prescribed by Lemma 2.4 and consider the cover
$\{\Omega_x : x \in S\} \cup \{\Omega_y\}$ of $\Omega$. After choosing a
partition of unity subordinate to this cover and repeating the above argument,
we get

$$\mathrm{RLCT}_\Omega(f; \varphi\phi) \leq \mathrm{RLCT}_{\Omega_y}(f; \varphi) \quad \text{for all } y \in \Omega.$$

Combining the two previously displayed equations proves the proposition. □

Abusing notation, we now let $\mathrm{RLCT}_{\Omega_x}(f; \varphi)$ represent
the real log canonical threshold for a sufficiently small neighborhood
$\Omega_x$ of $x$ in $\Omega$. If $x$ is an interior point of $\Omega$, we
denote the threshold at $x$ by $\mathrm{RLCT}_x(f; \varphi)$.

Corollary 2.6 (See also [25, §4.5]). Given a compact semianalytic set
$\Omega \subset \mathbb{R}^d$, a nearly analytic function
$\varphi : \Omega \to \mathbb{R}$, and $f \in \mathcal{A}_\Omega$ satisfying
$f(x) = 0$ for some $x \in \Omega$, the zeta function (5) can be continued
analytically to $\mathbb{C}$. It has a Laurent expansion (6) whose poles are
positive rational numbers with a smallest element.

Proof. The proofs of Lemma 2.4 and Proposition 2.5 outline a way to compute the
Laurent expansion of the zeta function (5). □

Example 2.7. We now show that the threshold at a boundary point depends on the
boundary inequalities. Consider the following two small neighborhoods of the
origin in some larger compact set.

$$\Omega_1 = \{(x, y) \in \mathbb{R}^2 : 0 \leq x \leq y \leq \varepsilon\}$$
$$\Omega_2 = \{(x, y) \in \mathbb{R}^2 : 0 \leq y \leq x \leq \varepsilon\}$$

To compute the real log canonical threshold of the function $xy^2$ over these
sets, we have the corresponding zeta functions below.

$$\zeta_1(z) = \int_0^\varepsilon \int_0^y x^{-z} y^{-2z} \, dx \, dy = \frac{\varepsilon^{-3z+2}}{(-z+1)(-3z+2)}$$

$$\zeta_2(z) = \int_0^\varepsilon \int_0^x x^{-z} y^{-2z} \, dy \, dx = \frac{\varepsilon^{-3z+2}}{(-2z+1)(-3z+2)}$$

This shows that $\mathrm{RLCT}_{\Omega_1}(xy^2) = 2/3$ while
$\mathrm{RLCT}_{\Omega_2}(xy^2) = 1/2$. □
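
The two zeta functions in Example 2.7 come from integrating the variables out from the innermost to the outermost, and each integration contributes one linear factor to the denominator. The Python sketch below tracks only those factors; the function name `nested_monomial_poles` and the restriction to nested regions of the form 0 <= t_1 <= ... <= t_n <= eps are assumptions made for this illustration.

```python
from fractions import Fraction

def nested_monomial_poles(exponents):
    """Poles of the integral of prod_i t_i^(-a_i * z) over 0 <= t_1 <= ... <= t_n <= eps.

    `exponents` lists a_1, ..., a_n from the innermost variable outwards.
    Integrating out t_i multiplies the result by 1 / (i - (a_1 + ... + a_i) * z),
    which has a pole at z = i / (a_1 + ... + a_i) when that sum is positive.
    """
    poles = []
    total = 0
    for i, a in enumerate(exponents, start=1):
        total += a
        if total > 0:
            poles.append(Fraction(i, total))
    return poles

# Omega_1: x is innermost (exponent 1), then y (exponent 2).
poles_1 = nested_monomial_poles([1, 2])
# Omega_2: y is innermost (exponent 2), then x (exponent 1).
poles_2 = nested_monomial_poles([2, 1])
```

Taking the minimum of each list recovers the two thresholds of the example, which differ only because the order of the boundary inequalities differs.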



Because the real log canonical threshold over a set
$\Omega \subset \mathbb{R}^d$ is the minimum of thresholds at points
$x \in \Omega$, we want to know where this minimum is achieved. Let us study
this problem topologically. Consider a locally finite collection $\mathcal{S}$
of pairwise disjoint submanifolds $S \subset \Omega$ such that
$\Omega = \bigcup_{S \in \mathcal{S}} S$ and each $S$ is locally closed, i.e.
the intersection of an open and a closed subset. Let $\bar{S}$ be the closure of
$S$. We say $\mathcal{S}$ is a stratification of $\Omega$ if
$S \cap \bar{T} \neq \emptyset$ implies $S \subset \bar{T}$ for all
$S, T \in \mathcal{S}$. A stratification $\mathcal{S}$ of $\Omega$ is a
refinement of another stratification $\mathcal{T}$ if
$S \cap T \neq \emptyset$ implies $S \subset T$ for all $S \in \mathcal{S}$ and
$T \in \mathcal{T}$.

Let the amplitude $\varphi : \Omega \to \mathbb{R}$ be nearly analytic. Let
$S_{(\lambda,\theta),1}, \ldots, S_{(\lambda,\theta),r}$ be the connected
components of the set
$\{x \in \Omega : \mathrm{RLCT}_{\Omega_x}(f; \varphi) = (\lambda, \theta)\}$,
and let $\mathcal{S}$ denote the collection $\{S_{(\lambda,\theta),i}\}$ where
we vary over all $\lambda$, $\theta$ and $i$. Now, define the order
$\mathrm{ord}_x f$ of $f$ at a point $x \in \Omega$ to be the smallest degree of
a monomial appearing in a series expansion of $f$ at $x$. This number is
independent of the choice of local coordinates $\omega_1, \ldots, \omega_d$
because it is the largest $k$ such that $f \in \mathfrak{m}_x^k$ where
$\mathfrak{m}_x = \{g \in \mathcal{A}_x : g(x) = 0\}$ is the vanishing ideal of
$x$. Define $T_{l,1}, \ldots, T_{l,s}$ to be the connected components of the set
$\{x \in \Omega : \mathrm{ord}_x f = l\}$ and let $\mathcal{T}$ be the
collection $\{T_{l,j}\}$ where we vary over all $l$ and $j$. We conjecture the
following relationship between $\mathcal{S}$ and $\mathcal{T}$. It implies that
the minimum real log canonical threshold over a set must occur at a point of
highest order.

Conjecture 2.8. The collections $\mathcal{S}$ and $\mathcal{T}$ are
stratifications of $\Omega$. Furthermore, if the amplitude $\varphi$ is a
positive smooth function, then $\mathcal{S}$ refines $\mathcal{T}$.

Laplace integrals such as (1) occur frequently in physics, statistics and other
applications. At first, the relationship between their asymptotic expansions and
the zeta function (3) seems strange. The key is to write these integrals as

$$Z(N) = \int_\Omega e^{-N|f(\omega)|} |\varphi(\omega)| \, d\omega = \int_0^\infty e^{-Nt} v(t) \, dt$$

$$\zeta(z) = \int_\Omega |f(\omega)|^{-z} |\varphi(\omega)| \, d\omega = \int_0^\infty t^{-z} v(t) \, dt$$

where $v(t)$ is the state density function or Gelfand-Leray function [2]

$$v(t) = \frac{d}{dt} \int_{0 < |f(\omega)| < t} |\varphi(\omega)| \, d\omega.$$

Formally, $Z(N)$ is the Laplace transform of $v(t)$ while $\zeta(z)$ is its
Mellin transform. Note that contrary to its name, $v(t)$ is not strictly a
function, but it can be defined as a Schwartz distribution. Next, we study the
series expansions

$$Z(N) \approx \sum_\alpha \sum_{i=1}^{d} c_{\alpha,i} \, N^{-\alpha} (\log N)^{i-1} \tag{7}$$

$$v(t) \approx \sum_\alpha \sum_{i=1}^{d} b_{\alpha,i} \, t^{\alpha-1} (\log t)^{i-1} \tag{8}$$

$$\zeta(z) \approx \sum_\alpha \sum_{i=1}^{d} d_{\alpha,i} \, (z - \alpha)^{-i} \tag{9}$$

where (7) and (8) are asymptotic expansions while (9) is the principal part of
the Laurent series expansion. Here, the number $d$ of summands is the dimension
of the parameter space $\Omega \subset \mathbb{R}^d$. Formulas relating the
coefficients $b_{\alpha,i}$, $c_{\alpha,i}$ and $d_{\alpha,i}$ are then deduced
from the Laplace and Mellin transforms of $t^{\alpha-1} (\log t)^i$. For more
detailed expositions on this subject, we refer the reader to
Arnol'd-Gusein-Zade-Varchenko [2, §6-7], Watanabe [25, §4] and Greenblatt [13].

Using this strategy, we now give explicit formulas for the asymptotic expansion
of an arbitrary Laplace integral. Our formulas generalize those of
Arnol'd-Gusein-Zade-Varchenko [2, §6-7] because they apply also to parameter
spaces $\Omega$ with analytic boundary. Watanabe [UJ Remark 4.5] gives a similar
asymptotic expansion for bounded parameter spaces but we derive precise
relationships between the asymptotic coefficients $c_{\alpha,i}$ and the Laurent
coefficients $d_{\alpha,i}$ in terms of derivatives of Gamma functions.

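Theorem 2.9. Let $\Omega \subset \mathbb{R}^d$ be a compact semianalytic subset
and $\varphi : \Omega \to \mathbb{R}$ be nearly analytic. If
$f \in \mathcal{A}_\Omega$ with $f(x) = 0$ for some $x \in \Omega$, the Laplace
integral

$$Z(N) = \int_\Omega e^{-N|f(\omega)|} |\varphi(\omega)| \, d\omega$$

has the asymptotic expansion

$$\sum_\alpha \sum_{i=1}^{d} c_{\alpha,i} \, N^{-\alpha} (\log N)^{i-1}. \tag{10}$$

The $\alpha$ in this expansion range over positive rational numbers which are
poles of

$$\zeta(z) = \int_{\Omega_\delta} |f(\omega)|^{-z} |\varphi(\omega)| \, d\omega \tag{11}$$

for any $\delta > 0$ and $\Omega_\delta = \{\omega \in \Omega : |f(\omega)| < \delta\}$.
The coefficients $c_{\alpha,i}$ satisfy

$$c_{\alpha,i} = \frac{(-1)^i}{(i-1)!} \sum_{j=i}^{d} \frac{\Gamma^{(j-i)}(\alpha)}{(j-i)!} \, d_{\alpha,j} \tag{12}$$

where $d_{\alpha,j}$ is the coefficient of $(z - \alpha)^{-j}$ in the Laurent
expansion of $\zeta(z)$.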

Proof. First, set $\delta = 1$. We split the integral $Z(N)$ into two parts:

$$Z(N) = \int_{|f(\omega)| < 1} e^{-N|f(\omega)|} |\varphi(\omega)| \, d\omega + \int_{|f(\omega)| \geq 1} e^{-N|f(\omega)|} |\varphi(\omega)| \, d\omega.$$

The second integral is bounded above by $Ce^{-N}$ for some positive constant
$C$, so asymptotically it goes to zero more quickly than any $N^{-\alpha}$. For
the first integral, we write $\zeta(z)$ as the Mellin transform of the state
density function $v(t)$.

$$\zeta(z) = \int_{|f(\omega)| < 1} |f(\omega)|^{-z} |\varphi(\omega)| \, d\omega = \int_0^1 t^{-z} v(t) \, dt.$$

By Corollary 2.6, $\zeta(z)$ has a Laurent expansion (6). Moreover, since
$|f(\omega)| < 1$, $\zeta(z) \to 0$ as $z \to -\infty$ so the polynomial part
$P(z)$ is identically zero. Applying the inverse Mellin transform to
$\zeta(z)$, we get a series expansion (8) of the state density function $v(t)$.
Applying the Laplace transform to $v(t)$ in turn gives the asymptotic expansion
(10) of $Z(N)$. The formulas

$$\int_0^\infty e^{-Nt} t^{\alpha-1} (\log t)^i \, dt \approx \sum_{j=0}^{i} \binom{i}{j} (-1)^j \, \Gamma^{(i-j)}(\alpha) \, N^{-\alpha} (\log N)^j$$

$$\int_0^1 t^{-z} t^{\alpha-1} (\log t)^i \, dt = -i! \, (z - \alpha)^{-(i+1)}$$

from [2, Thm 7.4] and [25, Ex 4.7] give us the relations

$$c_{\alpha,i} = (-1)^{i-1} \sum_{j=i}^{d} \binom{j-1}{i-1} \Gamma^{(j-i)}(\alpha) \, b_{\alpha,j}, \qquad d_{\alpha,j} = -(j-1)! \, b_{\alpha,j}.$$

Equation (12) follows immediately. Finally, for all other values of $\delta$, we
write

$$\int_\Omega |f(\omega)|^{-z} |\varphi(\omega)| \, d\omega = \int_{\Omega_\delta} |f(\omega)|^{-z} |\varphi(\omega)| \, d\omega + \int_{|f(\omega)| \geq \delta} |f(\omega)|^{-z} |\varphi(\omega)| \, d\omega.$$

The last integral does not have any poles, so the principal parts of the Laurent
expansions of the first two integrals are the same for all $\delta$. □
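
As a sanity check of the relation between Laurent coefficients and asymptotic coefficients, consider the hypothetical example f = xy on the unit square with constant amplitude, which is not taken from the paper. Its zeta function is (z - 1)^(-2), so the only nonzero Laurent coefficient is d_{1,2} = 1. Reading (12) as c_{a,i} = (-1)^i/(i-1)! * sum_{j>=i} Gamma^{(j-i)}(a)/(j-i)! * d_{a,j} then predicts c_{1,2} = 1 and c_{1,1} = -Gamma'(1), the Euler-Mascheroni constant. The Python sketch below compares this prediction with a numerical value of the Laplace integral; Simpson's rule and the step count are choices made for the illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # equals -Gamma'(1)

def laplace_integral_xy(N, steps=20000):
    """Z(N) = int_0^1 int_0^1 exp(-N*x*y) dx dy, computed with Simpson's rule in y.

    The inner integral over x is done in closed form: (1 - exp(-N*y)) / (N*y).
    `steps` must be even.
    """
    def inner(y):
        return 1.0 if y == 0.0 else -math.expm1(-N * y) / (N * y)

    h = 1.0 / steps
    total = inner(0.0) + inner(1.0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inner(i * h)
    return total * h / 3.0

def predicted_leading_terms(N):
    """c_{1,2} * N^-1 * log N + c_{1,1} * N^-1 with the coefficients predicted above."""
    c_12 = 1.0
    c_11 = EULER_GAMMA
    return (c_12 * math.log(N) + c_11) / N
```

For this particular integrand the remainder is exponentially small in N, so the numerical value and the two-term prediction should agree to many digits already for moderate N.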

3 Real Log Canonical Thresholds 

In this section, we prove fundamental properties of real log canonical thresholds
(RLCTs) which will allow us to calculate these thresholds more efficiently. The
learning coefficient of a statistical model is shown to be the RLCT of the ideal
generated by its defining equations.

In this section, let $\Omega \subset \mathbb{R}^d$ be a compact semianalytic
subset and let $\varphi : \Omega \to \mathbb{R}$ be nearly analytic. Given
functions $f_1, \ldots, f_r \in \mathcal{A}_\Omega$, let
$\mathrm{RLCT}_\Omega(f_1, \ldots, f_r; \varphi)$ be the smallest pole and
multiplicity of the zeta function (4). Recall that these pairs are ordered by
the rule $(\lambda_1, \theta_1) > (\lambda_2, \theta_2)$ if
$\lambda_1 > \lambda_2$, or $\lambda_1 = \lambda_2$ and $\theta_1 < \theta_2$.
For $x \in \Omega$, we define
$\mathrm{RLCT}_{\Omega_x}(f_1, \ldots, f_r; \varphi)$ to be the threshold for a
sufficiently small neighborhood $\Omega_x$ of $x$ in $\Omega$.

Remark 3.1. The (complex) log canonical threshold may be defined in a similar
fashion. It is the smallest pole of the zeta function

$$\zeta(z) = \int_\Omega \left( |f_1(\omega)|^2 + \cdots + |f_r(\omega)|^2 \right)^{-z} d\omega.$$

Note that the $f_i^2$ have been replaced by $|f_i|^2$ and the exponent $-z/2$ is
changed to $-z$. Crudely, this factor of 2 comes from the fact that
$\mathbb{C}^d$ is a real vector space of dimension $2d$. The complex threshold
is often different from the RLCT [20]. From the algebraic geometry point of
view, more is known about complex log canonical thresholds than about real log
canonical thresholds. Many results in this paper were motivated by their complex
analogs [Sl ll5ffT7].

Now, we give several equivalent definitions of
$\mathrm{RLCT}_\Omega(f_1, \ldots, f_r; \varphi)$ which are helpful in proofs of
the fundamental properties.

Proposition 3.2. Given functions $f_1, \ldots, f_r \in \mathcal{A}_\Omega$ such
that $V_\Omega(\langle f_1, \ldots, f_r \rangle)$ is nonempty, the pairs
$(\lambda, \theta)$ defined in the statements below are all equal.

a. The logarithmic Laplace integral

$$\log Z(N) = \log \int_\Omega \exp\Big( -N \sum_{i=1}^{r} f_i(\omega)^2 \Big) |\varphi(\omega)| \, d\omega$$

is asymptotically $-\frac{\lambda}{2} \log N + (\theta - 1) \log \log N + O(1)$.

b. The zeta function

$$\zeta(z) = \int_\Omega \Big( \sum_{i=1}^{r} f_i(\omega)^2 \Big)^{-z/2} |\varphi(\omega)| \, d\omega$$

has a smallest pole $\lambda$ of multiplicity $\theta$.

c. The pair $(\lambda, \theta)$ is the minimum

$$\min_{x \in \Omega} \mathrm{RLCT}_{\Omega_x}(f_1, \ldots, f_r; \varphi).$$

In fact, it is enough to vary $x$ over
$V_\Omega(\langle f_1, \ldots, f_r \rangle)$.

Proof. Item (b) is the original definition of the RLCT. The equivalence of (a)
and (b) follows from Theorem 2.9 and that of (b) and (c) from Proposition 2.5.
The last statement of (c) follows from the fact that the RLCT is $\infty$ for
points $x \notin V_\Omega(\langle f_1, \ldots, f_r \rangle)$. See also
[13 Thm 7.1]. □

Our first property describes the effect of the boundary on the RLCT. 

Proposition 3.3. Let $x$ be a boundary point of $\Omega \subset \mathbb{R}^d$.
Then, for a sufficiently small neighborhood $W$ of $x$ in $\mathbb{R}^d$,

$$\mathrm{RLCT}_W(f; \varphi) \leq \mathrm{RLCT}_{\Omega_x}(f; \varphi).$$

Proof. For a sufficiently small neighborhood $\Omega_x$ of $x$ in $\Omega$, we
have $\Omega_x \subset W$, so the corresponding Laplace integrals satisfy
$Z_{\Omega_x}(N) \leq Z_W(N)$. By Proposition 3.2, this gives the opposite
inequality on the RLCTs. □

If the function whose RLCT we are finding is complicated, we may replace it with
a simpler function that bounds it. Given $f, g \in \mathcal{A}_\Omega$, we say
that $f$ and $g$ are comparable in $\Omega$ if $c_1 f \leq g \leq c_2 f$ in
$\Omega$ for some $c_1, c_2 > 0$.

Proposition 3.4 ([25, §7]). Given $f, g \in \mathcal{A}_\Omega$, suppose that
$0 \leq cf \leq g$ in $\Omega$ for some $c > 0$. Then,
$\mathrm{RLCT}_\Omega(f; \varphi) \leq \mathrm{RLCT}_\Omega(g; \varphi)$.

Corollary 3.5. If $f, g$ are comparable in $\Omega$, then
$\mathrm{RLCT}_\Omega(f; \varphi) = \mathrm{RLCT}_\Omega(g; \varphi)$.

$\mathrm{RLCT}_\Omega(f_1^2 + \cdots + f_r^2; \varphi) = (\lambda, \theta)$
implies $\mathrm{RLCT}_\Omega(f_1, \ldots, f_r; \varphi) = (2\lambda, \theta)$.
From this, it seems that we should restrict ourselves to RLCTs of single and not
multiple functions. However, as the next proposition shows, multiple functions
are important because they allow us to work with ideals for which different
generating sets can be chosen. This gives us freedom to switch between single
and multiple functions in powerful ways. For instance, special cases of this
proposition such as Lemmas 3 and 4 of pQ have been used to simplify
computations.

Proposition 3.6. If two sets $\{f_1, \ldots, f_r\}$ and $\{g_1, \ldots, g_s\}$
of functions generate the same ideal $I \subset \mathcal{A}_\Omega$, then

$$\mathrm{RLCT}_\Omega(f_1, \ldots, f_r; \varphi) = \mathrm{RLCT}_\Omega(g_1, \ldots, g_s; \varphi).$$

Define this pair to be $\mathrm{RLCT}_\Omega(I; \varphi)$.

Proof. Each $g_j$ can be written as a combination
$h_1 f_1 + \cdots + h_r f_r$ of the $f_i$ where the $h_i$ are real analytic over
$\Omega$. By the Cauchy-Schwarz inequality,

$$g_j^2 \leq (h_1^2 + \cdots + h_r^2)(f_1^2 + \cdots + f_r^2).$$

Because $\Omega$ is compact, the $h_i$ are bounded. Thus, summing over all the
$g_j$, there is some constant $c > 0$ such that

$$\sum_{j=1}^{s} g_j^2 \leq c \sum_{i=1}^{r} f_i^2.$$

By Proposition 3.4,
$\mathrm{RLCT}_\Omega(g_1, \ldots, g_s; \varphi) \leq \mathrm{RLCT}_\Omega(f_1, \ldots, f_r; \varphi)$
and by symmetry, the reverse is also true, so we are done. See also
[20, §2.6]. □

For the next result, let $f_1, \ldots, f_r \in \mathcal{A}_X$ and
$g_1, \ldots, g_s \in \mathcal{A}_Y$ where $X \subset \mathbb{R}^m$ and
$Y \subset \mathbb{R}^n$ are compact semianalytic subsets. This occurs, for
instance, when the $f_i$ and $g_j$ are polynomials with disjoint sets of
indeterminates $\{x_1, \ldots, x_m\}$ and $\{y_1, \ldots, y_n\}$. Let
$\varphi_x : X \to \mathbb{R}$ and $\varphi_y : Y \to \mathbb{R}$ be nearly
analytic. Define
$(\lambda_x, \theta_x) = \mathrm{RLCT}_X(f_1, \ldots, f_r; \varphi_x)$ and
$(\lambda_y, \theta_y) = \mathrm{RLCT}_Y(g_1, \ldots, g_s; \varphi_y)$.

By composing with projections $X \times Y \to X$ and $X \times Y \to Y$, we may
regard the $f_i$ and $g_j$ as functions analytic over $X \times Y$. Let $I_x$
and $I_y$ be ideals in $\mathcal{A}_{X \times Y}$ generated by the $f_i$ and
$g_j$ respectively. Recall that the sum $I_x + I_y$ is generated by all the
$f_i$ and $g_j$ while the product $I_x I_y$ is generated by $f_i g_j$ for all
$i, j$.

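Proposition 3.7. The RLCTs for the sum and product of ideals $I_x$ and $I_y$ are

$$\mathrm{RLCT}_{X\times Y}(I_x + I_y;\varphi_x\varphi_y) = (\lambda_x + \lambda_y,\ \theta_x + \theta_y - 1),$$

$$\mathrm{RLCT}_{X\times Y}(I_x I_y;\varphi_x\varphi_y) = \begin{cases}
(\lambda_x, \theta_x) & \text{if } \lambda_x < \lambda_y, \\
(\lambda_y, \theta_y) & \text{if } \lambda_x > \lambda_y, \\
(\lambda_x, \theta_x + \theta_y) & \text{if } \lambda_x = \lambda_y.
\end{cases}$$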
Proof. Define $f(x) = f_1^2 + \cdots + f_r^2$ and $g(y) = g_1^2 + \cdots + g_s^2$, and let $Z_x(N)$ and
$Z_y(N)$ be the corresponding Laplace integrals. By Proposition 3.2,

$$\log Z_x(N) = -\tfrac{1}{2}\lambda_x \log N + (\theta_x - 1)\log\log N + O(1)$$
$$\log Z_y(N) = -\tfrac{1}{2}\lambda_y \log N + (\theta_y - 1)\log\log N + O(1)$$

asymptotically. If $(\lambda,\theta) = \mathrm{RLCT}_{X\times Y}(I_x + I_y;\varphi_x\varphi_y)$, then

$$\begin{aligned}
-\tfrac{1}{2}\lambda \log N + (\theta - 1)\log\log N + O(1)
&= \log \int_{X\times Y} e^{-N f(x) - N g(y)}\,|\varphi_x||\varphi_y|\,dx\,dy \\
&= \log \Big(\int_X e^{-N f(x)}|\varphi_x|\,dx\Big)\Big(\int_Y e^{-N g(y)}|\varphi_y|\,dy\Big) \\
&= \log Z_x(N) + \log Z_y(N) \\
&= -\tfrac{1}{2}(\lambda_x + \lambda_y)\log N + (\theta_x + \theta_y - 2)\log\log N + O(1)
\end{aligned}$$

and the first result follows. For the second result, note that

$$f(x)g(y) = f_1^2 g_1^2 + f_1^2 g_2^2 + \cdots + f_r^2 g_s^2.$$

Let $\zeta_x(z)$ and $\zeta_y(z)$ be the zeta functions corresponding to $f(x)$ and $g(y)$. By
Proposition 3.2, $(\lambda_x,\theta_x)$ and $(\lambda_y,\theta_y)$ are the smallest poles of $\zeta_x(z)$ and $\zeta_y(z)$
while $\mathrm{RLCT}_{X\times Y}(I_x I_y;\varphi_x\varphi_y)$ is the smallest pole of

$$\begin{aligned}
\zeta(z) &= \int_{X\times Y} \big(f(x)g(y)\big)^{-z/2}\,|\varphi_x||\varphi_y|\,dx\,dy \\
&= \Big(\int_X f(x)^{-z/2}|\varphi_x|\,dx\Big)\Big(\int_Y g(y)^{-z/2}|\varphi_y|\,dy\Big) = \zeta_x(z)\,\zeta_y(z).
\end{aligned}$$

The second result then follows from the relationship between the poles. □
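The two formulas of Proposition 3.7 are easy to mechanize. The following is a minimal sketch, not part of the original paper: the function names `rlct_sum` and `rlct_product` are hypothetical, pairs are plain `(lambda, theta)` tuples, and the ideals being combined are assumed to live in disjoint sets of variables, as the proposition requires. It starts from the pair (1, 1) of a single coordinate ideal with constant amplitude and reproduces the pair (6, 2) quoted later for the ideal in Case 1.3.

```python
from fractions import Fraction
from functools import reduce


def rlct_sum(x, y):
    """Pair of I_x + I_y from the pairs of I_x and I_y (Proposition 3.7)."""
    (lx, tx), (ly, ty) = x, y
    return (lx + ly, tx + ty - 1)


def rlct_product(x, y):
    """Pair of I_x * I_y from the pairs of I_x and I_y (Proposition 3.7)."""
    (lx, tx), (ly, ty) = x, y
    if lx < ly:
        return (lx, tx)
    if lx > ly:
        return (ly, ty)
    return (lx, tx + ty)


# Pair of the ideal <x> in one variable with amplitude 1: the zeta function
# is the integral of |x|^(-z), whose smallest pole is z = 1 with multiplicity 1.
VARIABLE = (Fraction(1), 1)


def coordinate_ideal(k):
    """Pair of <x_1, ..., x_k>, built by repeated use of the sum formula."""
    return reduce(rlct_sum, [VARIABLE] * k)


# <b1', b2', c1', c2'> + <a1, a2><d1, d2>, the ideal appearing in Case 1.3.
case_1_3 = rlct_sum(coordinate_ideal(4),
                    rlct_product(coordinate_ideal(2), coordinate_ideal(2)))
print(case_1_3)
```

Exact `Fraction` arithmetic is used so that the equality test in the product formula is not affected by rounding.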

Our last property tells us the behavior of RLCTs under a change of variables.
Consider an ideal $I \subset \mathcal{A}_W$ where $W$ is a neighborhood of the origin. Let $M$ be
a real analytic manifold and $\rho : M \to W$ a proper real analytic map. Then, the
pullback $\rho^* I = \{f \circ \rho : f \in I\}$ is an ideal of real analytic functions on $M$. If $\rho$
is an isomorphism between $M \setminus V(\rho^* I)$ and $W \setminus V(I)$, we say that $\rho$ is a change
of variables away from $V(I)$. Let $|\rho'|$ denote the Jacobian determinant of $\rho$. We
call $(\rho^* I; (\varphi \circ \rho)|\rho'|)$ the pullback pair.

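Proposition 3.8. Let $W$ be a neighborhood of the origin and $I \subset \mathcal{A}_W$ a finitely
generated ideal. If $M$ is a real analytic manifold, $\rho : M \to W$ is a change of
variables away from $V(I)$ and $\mathcal{M} = \rho^{-1}(\Omega \cap W)$, then

$$\mathrm{RLCT}_{\Omega_0}(I;\varphi) = \min_{x \in \rho^{-1}(0)} \mathrm{RLCT}_{\mathcal{M}_x}(\rho^* I; (\varphi \circ \rho)|\rho'|).$$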

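Proof. Let $f_1,\ldots,f_r$ generate $I$ and let $f = f_1^2 + \cdots + f_r^2$. Then, $\mathrm{RLCT}_{\Omega_0}(I;\varphi)$
is the smallest pole and multiplicity of the zeta function

$$\zeta(z) = \int_{\Omega_0} f(\omega)^{-z/2}\,|\varphi(\omega)|\,d\omega$$

where $\Omega_0 \subset W$ is a sufficiently small neighborhood of the origin in $\Omega$. Applying
the change of variables $\rho$, we have

$$\zeta(z) = \int_{\rho^{-1}(\Omega_0)} (f \circ \rho)(\mu)^{-z/2}\,|(\varphi \circ \rho)(\mu)|\,|\rho'(\mu)|\,d\mu.$$

The proof of Lemma 2.4 shows that if $\Omega_0$ is sufficiently small, there are finitely
many points $y \in \rho^{-1}(0)$ and a cover $\{\mathcal{M}_y\}$ of $\mathcal{M} = \rho^{-1}(\Omega_0)$ such that

$$\zeta(z) = \sum_y \int_{\mathcal{M}_y} (f \circ \rho)(\mu)^{-z/2}\,|(\varphi \circ \rho)(\mu)|\,|\rho'(\mu)|\,\sigma_y(\mu)\,d\mu$$

where $\{\sigma_y\}$ is a partition of unity subordinate to $\{\mathcal{M}_y\}$. Furthermore, the $f_i \circ \rho$
generate the pullback $\rho^* I$ and $f \circ \rho = (f_1 \circ \rho)^2 + \cdots + (f_r \circ \rho)^2$. Therefore,

$$\mathrm{RLCT}_{\mathcal{M}_y}(f \circ \rho; (\varphi \circ \rho)|\rho'|\sigma_y) = \mathrm{RLCT}_{\mathcal{M}_y}(\rho^* I; (\varphi \circ \rho)|\rho'|)$$

and the result follows from the two previously displayed equations. □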

We are now ready to prove Theorem 1.2, which was inspired by Watanabe.

Proof of Theorem 1.2. Let $Q(\omega) = \sum_{i=1}^{k} (p_i(\omega) - q_i)^2$. The learning coefficient
is the RLCT of the Kullback-Leibler distance $K(\omega)$, so it is enough to show that
$\mathrm{RLCT}_{\Omega_x} K = \mathrm{RLCT}_{\Omega_x} Q$ for each $x \in V(K) = V(Q)$. By Corollary 3.5, we only
need to show that $K$ and $Q$ are comparable in a sufficiently small neighborhood
of $x$. Now, the Taylor expansion $-\log t = (1 - t) + \tfrac{1}{2}(1 - t)^2 + \cdots$ implies there
are constants $c_1, c_2 > 0$ such that for all $t$ near 1,

$$c_1 (t - 1)^2 \le -\log t + t - 1 \le c_2 (t - 1)^2. \qquad (13)$$

Choosing a sufficiently small neighborhood $W_x$ of $x$ such that $p_i(\omega)/q_i$ is near 1, we have

$$c_1 \Big(\frac{p_i(\omega)}{q_i} - 1\Big)^2 \le -\log\frac{p_i(\omega)}{q_i} + \frac{p_i(\omega)}{q_i} - 1 \le c_2 \Big(\frac{p_i(\omega)}{q_i} - 1\Big)^2$$

for all $\omega \in W_x$. Multiplying by $q_i$, summing from $i = 1$ to $k$ and observing that
the $p_i$ and the $q_i$ add up to 1, we get

$$c_1 \sum_{i=1}^{k} q_i \Big(\frac{p_i(\omega)}{q_i} - 1\Big)^2 \le K(\omega) \le c_2 \sum_{i=1}^{k} q_i \Big(\frac{p_i(\omega)}{q_i} - 1\Big)^2.$$

Again, using the fact that the $q_i$ are non-zero, we have

$$\frac{c_1}{\max_i q_i} \sum_{i=1}^{k} (p_i(\omega) - q_i)^2 \le K(\omega) \le \frac{c_2}{\min_i q_i} \sum_{i=1}^{k} (p_i(\omega) - q_i)^2$$

which completes the proof. □
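The final two-sided bound in the proof can be checked numerically. The sketch below is only an illustration under stated assumptions: the distributions `q` and `p` are made-up test values, and the constants `c1 = 1/4`, `c2 = 1` are one admissible choice in inequality (13) when every ratio $p_i/q_i$ lies in $[0.5, 1.5]$ (on that interval the quotient $(-\log t + t - 1)/(t-1)^2$ stays between roughly 0.38 and 0.78).

```python
import math


def kl_divergence(q, p):
    """K = sum_i q_i log(q_i / p_i) for strictly positive distributions."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))


def squared_distance(q, p):
    """Q = sum_i (p_i - q_i)^2."""
    return sum((pi - qi) ** 2 for qi, pi in zip(q, p))


def comparability_bounds(q, p, c1=0.25, c2=1.0):
    """Return (lower, K, upper) with lower <= K <= upper as in the proof.

    c1 and c2 are illustrative constants, valid only while p_i / q_i
    stays inside [0.5, 1.5].
    """
    Q = squared_distance(q, p)
    return c1 / max(q) * Q, kl_divergence(q, p), c2 / min(q) * Q


q = (0.2, 0.3, 0.5)        # hypothetical true distribution
p = (0.25, 0.3, 0.45)      # hypothetical nearby model distribution
lower, K, upper = comparability_bounds(q, p)
print(lower, K, upper)
```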
4 Newton Polyhedra and Nondegeneracy

Given an analytic function $f \in \mathcal{A}_0(\mathbb{R}^d)$, we pick local coordinates $\{\omega_1,\ldots,\omega_d\}$
in a neighborhood of the origin. This allows us to represent $f$ as a power series
$\sum_\alpha c_\alpha \omega^\alpha$ where $\omega = (\omega_1,\ldots,\omega_d)$ and each $\alpha = (\alpha_1,\ldots,\alpha_d) \in \mathbb{N}^d$. Let $[\omega^\alpha]f$
denote the coefficient $c_\alpha$ of $\omega^\alpha$ in this expansion. Define its Newton polyhedron
$\mathcal{P}(f) \subset \mathbb{R}^d$ to be the convex hull

$$\mathcal{P}(f) = \mathrm{conv}\,\{\alpha + \alpha' : [\omega^\alpha]f \ne 0,\ \alpha' \in \mathbb{R}^d_{\ge 0}\}.$$

A subset $\gamma \subset \mathcal{P}(f)$ is a face if there exists $\beta \in \mathbb{R}^d$ such that

$$\gamma = \{\alpha \in \mathcal{P}(f) : \langle\alpha,\beta\rangle \le \langle\alpha',\beta\rangle \text{ for all } \alpha' \in \mathcal{P}(f)\}$$

where $\langle\ ,\ \rangle$ is the standard dot product. Dually, the normal cone at $\gamma$ is the set of
all $\beta \in \mathbb{R}^d$ satisfying the above condition. Each $\beta$ lies in the non-negative orthant
$\mathbb{R}^d_{\ge 0}$ because otherwise, the linear function $\langle\cdot,\beta\rangle$ does not have a minimum over
the unbounded set $\mathcal{P}(f)$. As a result, the union of all the normal cones gives a
partition $\mathcal{F}(f)$ of the non-negative orthant called the normal fan. Now, given a
compact subset $\gamma \subset \mathbb{R}^d$, define the face polynomial

$$f_\gamma = \sum_{\alpha \in \gamma} c_\alpha \omega^\alpha.$$

Recall that $f_\gamma$ is singular at a point $x \in \mathbb{R}^d$ if $\mathrm{ord}_x f_\gamma \ge 2$, i.e.

$$f_\gamma(x) = \frac{\partial f_\gamma}{\partial \omega_1}(x) = \cdots = \frac{\partial f_\gamma}{\partial \omega_d}(x) = 0.$$

We say that $f$ is nondegenerate if $f_\gamma$ is non-singular at all points in the torus
$(\mathbb{R}^*)^d$ for all compact faces $\gamma$ of $\mathcal{P}(f)$, otherwise we say $f$ is degenerate. Now,
we define the distance $l$ of $\mathcal{P}(f)$ to be the smallest $t \ge 0$ such that $(t,t,\ldots,t) \in \mathcal{P}(f)$.
Let the multiplicity $\theta$ of $l$ be the codimension of the face of $\mathcal{P}(f)$ at this
intersection of the diagonal with $\mathcal{P}(f)$. However, if $l = 0$, we leave $\theta$ undefined.
These notions of nondegeneracy, distance and multiplicity were first coined and
studied by Varchenko [22].

We now extend the above notions to ideals. For any ideal $I \subset \mathcal{A}_0$, define

$$\mathcal{P}(I) = \mathrm{conv}\,\{\alpha \in \mathbb{R}^d : [\omega^\alpha]f \ne 0 \text{ for some } f \in I\}.$$

Related to this geometric construction is the monomial ideal

$$\mathrm{mon}(I) = \langle \omega^\alpha : [\omega^\alpha]f \ne 0 \text{ for some } f \in I \rangle.$$

Note that $I$ and $\mathrm{mon}(I)$ have the same Newton polyhedron, and if $I$ is generated
by $f_1,\ldots,f_r$, then $\mathrm{mon}(I)$ is generated by monomials $\omega^\alpha$ appearing in the $f_i$.
One consequence is that $\mathcal{P}(f_1^2 + \cdots + f_r^2)$ is the scaled polyhedron $2\mathcal{P}(I)$. More
importantly, the threshold of $I$ is bounded by that of $\mathrm{mon}(I)$. To prove this
result, we need the following lemma. Recall that by the Hilbert Basis Theorem
or by Dickson's Lemma [8], $\mathrm{mon}(I)$ is finitely generated.
Lemma 4.1. Given $f \in \mathcal{A}_0(\mathbb{R}^d)$, let $S$ be a finite set of exponents $\alpha$ of monomials
$\omega^\alpha$ which generate $\mathrm{mon}(\langle f \rangle)$. Then, there is a constant $c > 0$ such that

$$|f(\omega)| \le c \sum_{\alpha \in S} |\omega|^\alpha$$

in a sufficiently small neighborhood of the origin.

Proof. Let $\sum_\alpha c_\alpha \omega^\alpha$ be the power series expansion of $f$. Because $f$ is analytic
at the origin, there exists $\varepsilon > 0$ such that

$$\sum_\alpha |c_\alpha|\,\varepsilon^{\alpha_1 + \cdots + \alpha_d} < \infty.$$

Now, let $S = \{\alpha^{(1)}, \alpha^{(2)}, \ldots, \alpha^{(s)}\}$. Since the monomials $\omega^{\alpha^{(i)}}$ generate $\mathrm{mon}(\langle f \rangle)$,

$$f(\omega) = \omega^{\alpha^{(1)}} g_1(\omega) + \cdots + \omega^{\alpha^{(s)}} g_s(\omega)$$

for some power series $g_i(\omega)$. Each series $g_i(\omega)$ is absolutely convergent in the
$\varepsilon$-neighborhood $U$ of the origin because $f$ is absolutely convergent in $U$. Thus, the
$g_i(\omega)$ are analytic. Their absolute values are bounded above by some constant
$c$ in $U$, and the lemma follows. □

Proposition 4.2. Let $I \subset \mathcal{A}_0$ be a finitely generated ideal and $\varphi : \mathbb{R}^d \to \mathbb{R}$ be
nearly analytic at the origin. Then,

$$\mathrm{RLCT}_0(I;\varphi) \le \mathrm{RLCT}_0(\mathrm{mon}(I);\varphi).$$

Proof. Given $f \in \mathcal{A}_0(\mathbb{R}^d)$, let $S$ be a finite set of generating exponents $\alpha$ for
$\mathrm{mon}(\langle f \rangle)$. By Lemma 4.1 and the Cauchy-Schwarz inequality, there exist
constants $c, c' > 0$ such that

$$f^2 \le \Big(c \sum_{\alpha \in S} |\omega|^\alpha\Big)^2 \le c' \sum_{\alpha \in S} \omega^{2\alpha}$$

in a sufficiently small neighborhood of the origin. Therefore, if $f_1,\ldots,f_r$ generate
$I$, then $f_1^2 + \cdots + f_r^2$ is bounded by a constant multiple of the sum of squares of
monomials generating $\mathrm{mon}(I)$. The result now follows from Proposition 3.4. □

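Given a compact subset $\gamma \subset \mathbb{R}^d$, define the face ideal

$$I_\gamma = \langle f_\gamma : f \in I \rangle.$$

The next result tells us how to compute $I_\gamma$ for an ideal $I = \langle f_1,\ldots,f_r \rangle$.

Proposition 4.3. For all compact faces $\gamma \subset \mathcal{P}(I)$, $I_\gamma = \langle f_{1\gamma},\ldots,f_{r\gamma} \rangle$.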
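Proof. By definition, $\langle f_{1\gamma},\ldots,f_{r\gamma} \rangle \subset I_\gamma$. For the other inclusion, it is enough to
show that $f_\gamma \in \langle f_{1\gamma},\ldots,f_{r\gamma} \rangle$ for all $f \in I$. First, we claim that if $\omega^\alpha = \omega^{\alpha'}\omega^{\alpha''}$
with $\alpha \in \gamma$ and $\omega^{\alpha'} \in \mathrm{mon}(I)$, then $\omega^{\alpha''} = 1$. Indeed, for all $\beta \in \mathbb{R}^d_{\ge 0}$ normal to
$\gamma$, we have $\langle\alpha,\beta\rangle = \langle\alpha',\beta\rangle + \langle\alpha'',\beta\rangle$, but $\langle\alpha,\beta\rangle \le \langle\alpha',\beta\rangle$ so $\langle\alpha'',\beta\rangle = 0$. This
implies that $\alpha' + k\alpha'' \in \gamma$ for all integers $k > 0$. Since $\gamma$ is compact, $\alpha''$ must
be the zero vector so $\omega^{\alpha''} = 1$.

Now, if $f \in I$, then $f = h_1 f_1 + \cdots + h_r f_r$ for some analytic functions
$h_1,\ldots,h_r$. Clearly, $f_\gamma = (h_1 f_1)_\gamma + \cdots + (h_r f_r)_\gamma$. By the above claim, $(h_i f_i)_\gamma = h_{i0} f_{i\gamma}$
where $h_{i0}$ is the constant term in $h_i$. Hence, $f_\gamma = h_{10} f_{1\gamma} + \cdots + h_{r0} f_{r\gamma} \in
\langle f_{1\gamma},\ldots,f_{r\gamma} \rangle$ as required. □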

Remark 4.4. We now explain why we do not run into Gröbner-basis issues in
this proposition. Let $\beta$ be a vector in the normal cone at the face $\gamma$ of $\mathcal{P}(I)$.
Now, consider the weight order associated to $\beta$, and let $\mathrm{in}_\beta f$ be the sum of all
the terms of $f$ that are maximal with respect to this order [H §15]. Let $\mathrm{in}_\beta I$
be the initial ideal $\mathrm{in}_\beta I = \langle \mathrm{in}_\beta f : f \in I \rangle$. A set of functions $f_1,\ldots,f_r \in I$ is
defined to be a Gröbner basis for $I$ if the initial ideal $\mathrm{in}_\beta I$ is generated by the
$\mathrm{in}_\beta f_i$. Not all generating sets are Gröbner bases. But in our case, the face ideal
$I_\gamma$ is not the initial ideal $\mathrm{in}_\beta I$. In fact, the face polynomial $f_\gamma$ is not the initial
form $\mathrm{in}_\beta f$. For instance, if $I = \langle x, y \rangle$, $f = x^2 + y^2 \in I$, $\beta = (1,1) \in \mathbb{R}^2$ and $\gamma$ is
the face of $\mathcal{P}(I)$ normal to $\beta$, then $\mathrm{in}_\beta f = x^2 + y^2$ but $f_\gamma = 0$. □

Lastly, we give several equivalent definitions of nondegeneracy for ideals. If
an ideal $I$ satisfies these conditions, then we say that $I$ is sos-nondegenerate,
where sos stands for sum-of-squares. Note that the nondegeneracy of a function
$f$ need not imply the sos-nondegeneracy of the ideal $\langle f \rangle$, e.g. $f = x + y$.

Proposition 4.5. Let $I \subset \mathcal{A}_0$ be an ideal. The following are equivalent:

1. For some generating set $\{f_1,\ldots,f_r\}$ for $I$, $f_1^2 + \cdots + f_r^2$ is nondegenerate.

2. For all generating sets $\{f_1,\ldots,f_r\}$ for $I$, $f_1^2 + \cdots + f_r^2$ is nondegenerate.

3. For all compact faces $\gamma \subset \mathcal{P}(I)$, the variety $V(I_\gamma) \subset \mathbb{R}^d$ does not intersect
the torus $(\mathbb{R}^*)^d$.

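Proof. Let $f_1,\ldots,f_r$ generate $I$ and let $f = f_1^2 + \cdots + f_r^2$. If $\gamma$ is a compact
face of $\mathcal{P}(I)$, then the set $(2\gamma)$ is a compact face of $\mathcal{P}(f) = 2\mathcal{P}(I)$. Furthermore,
$f_{(2\gamma)} = f_{1\gamma}^2 + \cdots + f_{r\gamma}^2$ and

$$\frac{\partial f_{(2\gamma)}}{\partial \omega_i} = 2 f_{1\gamma}\frac{\partial f_{1\gamma}}{\partial \omega_i} + \cdots + 2 f_{r\gamma}\frac{\partial f_{r\gamma}}{\partial \omega_i}.$$

Now, $f_{1\gamma}^2 + \cdots + f_{r\gamma}^2 = 0$ if and only if $f_{1\gamma} = \cdots = f_{r\gamma} = 0$. It follows that $f$ is
nondegenerate if and only if $V(\langle f_{1\gamma},\ldots,f_{r\gamma} \rangle) \cap (\mathbb{R}^*)^d = V(I_\gamma) \cap (\mathbb{R}^*)^d = \emptyset$ for
all compact faces $\gamma \subset \mathcal{P}(I)$. This proves (1) $\Leftrightarrow$ (3) and (2) $\Leftrightarrow$ (3). □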

Remark 4.6. After finishing this paper, we discovered another notion of
nondegeneracy for ideals of complex formal power series due to Saia [19], which was
shown to be equivalent to the complex version of Proposition 4.5(3) [3 §2].
We recall some basic facts about toric varieties. We say a polyhedral cone $\sigma$ is
generated by vectors $v_1,\ldots,v_k \in \mathbb{R}^d$ if $\sigma = \{\sum_i \lambda_i v_i : \lambda_i \ge 0\}$. If $\sigma$ is generated
by lattice vectors $v_i \in \mathbb{Z}^d$, then $\sigma$ is rational. If the origin is a face of $\sigma$, then
$\sigma$ is pointed. A ray is a pointed one-dimensional cone. Every rational ray has
a lattice generator of minimal length called the minimal generator. Similarly,
every pointed rational polyhedral cone $\sigma$ is generated by the minimal generators
of its edges. If these minimal generators are linearly independent over $\mathbb{R}$, then
$\sigma$ is simplicial. A simplicial cone is smooth if its minimal generators also form
part of a $\mathbb{Z}$-basis of $\mathbb{Z}^d$. A collection $\mathcal{F}$ of pointed rational polyhedral cones in
$\mathbb{R}^d$ is a fan if the faces of every cone in $\mathcal{F}$ are in $\mathcal{F}$ and the intersection of any
two cones in $\mathcal{F}$ is again in $\mathcal{F}$. The support of $\mathcal{F}$ is the union of its cones as
subsets of $\mathbb{R}^d$. If the support of $\mathcal{F}$ is the non-negative orthant, then $\mathcal{F}$ is locally
complete. If every cone of $\mathcal{F}$ is simplicial (resp. smooth), then $\mathcal{F}$ is simplicial
(resp. smooth). A fan $\mathcal{F}_1$ is a refinement of another fan $\mathcal{F}_2$ if the cones of $\mathcal{F}_1$
come from partitioning the cones of $\mathcal{F}_2$. See [TU] for more details.

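Given a smooth simplicial locally complete fan $\mathcal{F}$, we have a smooth toric
variety $\mathbb{P}(\mathcal{F})$ covered by open charts $U_\sigma \simeq \mathbb{R}^d$, one for each maximal cone $\sigma$ of
$\mathcal{F}$. Furthermore, we have a blow-up $\rho_\mathcal{F} : \mathbb{P}(\mathcal{F}) \to \mathbb{R}^d$ defined as follows: for each
maximal cone $\sigma$ of $\mathcal{F}$ minimally generated by $v_1,\ldots,v_d$ with $v_i = (v_{i1},\ldots,v_{id})$,
we have monomial maps $\rho_\sigma : U_\sigma \to \mathbb{R}^d$ on the open charts.

$$\begin{aligned}
(\mu_1,\ldots,\mu_d) &\mapsto (\omega_1,\ldots,\omega_d) \\
\omega_1 &= \mu_1^{v_{11}} \mu_2^{v_{21}} \cdots \mu_d^{v_{d1}} \\
\omega_2 &= \mu_1^{v_{12}} \mu_2^{v_{22}} \cdots \mu_d^{v_{d2}} \\
&\ \ \vdots \\
\omega_d &= \mu_1^{v_{1d}} \mu_2^{v_{2d}} \cdots \mu_d^{v_{dd}}
\end{aligned}$$

Let $v = v_\sigma$ be the matrix $(v_{ij})$ where each minimal generator $v_i$ forms a row of
$v$. We represent the above monomial map by $\omega = \mu^v$. If $v_{i+}$ represents the $i$-th
row sum of $v$, the Jacobian determinant of this map is

$$(\det v)\,\mu_1^{v_{1+}-1} \cdots \mu_d^{v_{d+}-1}.$$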
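As a small worked check of the Jacobian formula (an illustration added here, not taken from the paper), take $d = 2$ and the smooth cone with minimal generators $v_1 = (1,0)$ and $v_2 = (1,1)$. The monomial map is then one chart of the usual blow-up of the origin in the plane:

```latex
\[
v = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad
\omega_1 = \mu_1^{v_{11}} \mu_2^{v_{21}} = \mu_1 \mu_2, \qquad
\omega_2 = \mu_1^{v_{12}} \mu_2^{v_{22}} = \mu_2 .
\]
\[
\det \frac{\partial(\omega_1,\omega_2)}{\partial(\mu_1,\mu_2)}
= \det \begin{pmatrix} \mu_2 & \mu_1 \\ 0 & 1 \end{pmatrix} = \mu_2,
\qquad
(\det v)\,\mu_1^{v_{1+}-1}\mu_2^{v_{2+}-1}
= 1 \cdot \mu_1^{1-1}\,\mu_2^{2-1} = \mu_2 .
\]
```

Both computations agree, with row sums $v_{1+} = 1$ and $v_{2+} = 2$.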

We are now ready to connect these concepts. The next two theorems are due
to Varchenko, see [22] and [21 §8.3]. His notion of degeneracy is weaker than ours
because it does not include the condition $f_\gamma = 0$, but his proof [21 Lemma 8.9]
actually supports the stronger notion. The set up is as follows: suppose $f$ is
analytic in a neighborhood $W$ of the origin. Let $\mathcal{F}$ be any smooth simplicial
refinement of the normal fan $\mathcal{F}(f)$ and $\rho_\mathcal{F}$ be the blow-up associated to $\mathcal{F}$. Set
$M = \rho_\mathcal{F}^{-1}(W)$. Let $l$ be the distance of $\mathcal{P}(f)$ and $\theta$ its multiplicity.

Theorem 4.7. If $f$ is nondegenerate, then $(M, W, \rho_\mathcal{F})$ desingularizes $f$ at 0.

Theorem 4.8. Suppose $(M, W, \rho_\mathcal{F})$ desingularizes $f$ at 0. If $f$ has a maximum
or minimum at 0, then $\mathrm{RLCT}_0 f = (1/l, \theta)$.
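We extend Theorem 4.8 to compute $\mathrm{RLCT}_0(f;\omega^\tau)$ for monomials $\omega^\tau$. Given
a polyhedron $\mathcal{P}(f) \subset \mathbb{R}^d$ and a vector $\tau = (\tau_1,\ldots,\tau_d)$ of non-negative integers,
let the $\tau$-distance $l_\tau$ be the smallest $t \ge 0$ such that $t(\tau_1 + 1,\ldots,\tau_d + 1) \in \mathcal{P}(f)$
and let the multiplicity $\theta_\tau$ be the codimension of the face at this intersection.

Theorem 4.9. Suppose $(M, W, \rho_\mathcal{F})$ desingularizes $f$ at 0. If $f$ has a maximum
or minimum at 0, then $\mathrm{RLCT}_0(f;\omega^\tau) = (1/l_\tau, \theta_\tau)$.

Proof. We follow roughly the proof in [21 §8] of Theorem 4.8. Let $\sigma$ be a maximal
cone of $\mathcal{F}$. Because $\mathcal{F}$ refines $\mathcal{F}(f)$, $\sigma$ is a subset of some maximal cone $\sigma'$ of
$\mathcal{F}(f)$. Let $\alpha \in \mathbb{R}^d$ be the vertex of $\mathcal{P}(f)$ dual to $\sigma'$. Let $v$ be the matrix whose
rows are minimal generators of $\sigma$ and $\rho$ the monomial map $\mu \mapsto \mu^v$. Then,

$$\begin{aligned}
f(\omega)^{-z}\,\omega^\tau\,d\omega &= f(\rho(\mu))^{-z}\,\rho(\mu)^\tau\,|\rho'(\mu)|\,d\mu \\
&= (\det v)\,g(\mu)^{-z}\,\mu_1^{-\langle v_1,\alpha\rangle z + \langle v_1,\tau+1\rangle - 1} \cdots \mu_d^{-\langle v_d,\alpha\rangle z + \langle v_d,\tau+1\rangle - 1}\,d\mu
\end{aligned}$$

for some function $g(\mu)$. Because $f$ has a maximum or minimum at 0, this ensures
that $g(\mu) \ne 0$ on the affine chart $U_\sigma$. Thus, for the cone $\sigma$,

$$(\lambda_\sigma, \theta_\sigma) = (\min S,\ \#\min S), \qquad S = \Big\{\frac{\langle v_i, \tau+1\rangle}{\langle v_i, \alpha\rangle} : 1 \le i \le d\Big\}$$

where $\tau + 1 = (\tau_1 + 1,\ldots,\tau_d + 1)$. We now give an interpretation for the elements
of $S$. Fixing $i$, let $P$ be the affine hyperplane normal to $v_i$ passing through $\alpha$.
Then, $\langle v_i,\alpha\rangle/\langle v_i,\tau+1\rangle$ is the distance of $P$ from the origin along the ray
$\{t(\tau + 1) : t \ge 0\}$. Since $\mathrm{RLCT}_0(f;\omega^\tau) = \min_\sigma (\lambda_\sigma, \theta_\sigma)$, the result follows. □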

Remark 4.10. After finishing this paper, the author discovered that a similar 
result was proved by Vasil'ev [23] for complex analytic functions. □ 
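For two variables, the $\tau$-distance and its multiplicity can be computed by hand or with a few lines of code. The sketch below is an illustration under stated assumptions, not the author's software: it handles only $d = 2$, the name `tau_distance_2d` is hypothetical, and it uses the dual description $l_\tau = \max_\beta \min_\alpha \langle\beta,\alpha\rangle$ over $\beta \ge 0$ normalized by $\langle\beta,\tau+1\rangle = 1$, where the maximum is attained at a coordinate axis or at the normal of a segment joining two exponents. The multiplicity is 1 when the maximizer is unique (the ray meets the interior of an edge) and 2 otherwise (the ray meets a vertex).

```python
from fractions import Fraction
from itertools import combinations


def tau_distance_2d(exponents, tau=(0, 0)):
    """Return (l_tau, theta_tau) for conv{alpha + R^2_{>=0}} in two variables.

    `exponents` are integer pairs; `tau` is a pair of non-negative integers.
    The result is meaningful only when l_tau > 0 (theta is undefined at 0).
    """
    a, b = tau[0] + 1, tau[1] + 1            # direction tau + 1 of the ray
    points = set(exponents)
    # Candidate normals beta >= 0: both axes, plus normals of all segments.
    normals = [(1, 0), (0, 1)]
    for (p1, q1), (p2, q2) in combinations(points, 2):
        n1, n2 = q1 - q2, p2 - p1            # normal to the segment
        if n1 <= 0 and n2 <= 0:
            n1, n2 = -n1, -n2
        if n1 >= 0 and n2 >= 0:              # keep only non-negative normals
            normals.append((n1, n2))
    # Normalize so that <beta, tau + 1> = 1; equal directions collapse.
    betas = {(Fraction(n1, n1 * a + n2 * b), Fraction(n2, n1 * a + n2 * b))
             for n1, n2 in normals}
    # Support value of each beta: the minimum of <beta, alpha> over exponents.
    values = {beta: min(beta[0] * p + beta[1] * q for p, q in points)
              for beta in betas}
    l_tau = max(values.values())
    maximizers = [beta for beta, val in values.items() if val == l_tau]
    theta = 1 if len(maximizers) == 1 else 2
    return l_tau, theta


# u^2 + y^4 (the function of Remark 4.12 after substituting u = x + y).
print(tau_distance_2d([(2, 0), (0, 4)]))
# The monomial rs with amplitude r^5 s^8, as in the proof of Proposition 5.3.
print(tau_distance_2d([(1, 1)], tau=(5, 8)))
```

Under the hypotheses of Theorem 4.9, the RLCT is then $(1/l_\tau, \theta_\tau)$; for example $l = 4/3$ for $u^2 + y^4$ gives $(3/4, 1)$, in agreement with Remark 4.12.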

Monomial ideals play a special role in the theory of real log canonical
thresholds of ideals. The proof of this next result is due to Piotr Zwiernik.

Proposition 4.11. Monomial ideals are sos-nondegenerate.

Proof. Let $f = f_1^2 + \cdots + f_r^2$ where $f_1,\ldots,f_r$ are monomials generating $I$. For
each face $\gamma$ of $\mathcal{P}(I)$, $f_\gamma$ is also a sum of squares of monomials, so $f_\gamma$ does not
have any zeros in $(\mathbb{R}^*)^d$ and the result now follows from Proposition 4.5(3). □

Our tools now allow us to prove Theorem 1.3. As a special case, we have a
formula for the RLCT of a monomial ideal with respect to a monomial amplitude
function. The analogous formula for complex log canonical thresholds of
monomial ideals was discovered and proved by Howald [15].

Proof of Theorem 1.3. If the ideal $I$ is sos-nondegenerate, then the equality
follows from Proposition 4.5, Theorem 4.7 and Theorem 4.9. For all other ideals,
the inequality is the result of Proposition 4.2 and Proposition 4.11. □
Remark 4.12. Define the principal part $f_\mathcal{P}$ of $f$ to be $\sum_\alpha c_\alpha \omega^\alpha$ where the sum
is over all $\alpha$ lying in some compact face $\gamma$ of $\mathcal{P}(f)$. The above theorems imply
that if $f$ is nondegenerate, then $\mathrm{RLCT}_0 f = \mathrm{RLCT}_0 f_\mathcal{P}$. However, the latter is
not true in general. For instance, if $f = (x + y)^2 + y^4$, then $f_\mathcal{P} = (x + y)^2$ but
$\mathrm{RLCT}_0 f = (3/4, 1)$ and $\mathrm{RLCT}_0 f_\mathcal{P} = (1/2, 1)$.

Our first corollary shows that the BIC is a special case of Theorem 1.1.

Corollary 4.13. If $f \in \mathcal{A}_0(\mathbb{R}^d)$ has a local minimum at the origin with $f(0) = 0$
and its Hessian $(\partial^2 f/\partial\omega_i\partial\omega_j)$ is full rank, then $\mathrm{RLCT}_0 f = (d/2, 1)$.

Proof. Because its Hessian is full rank, there is a linear change of variables such
that $f = \omega_1^2 + \cdots + \omega_d^2 + O(\omega^3)$. Thus, $f$ is nondegenerate and the Newton
polyhedron $\mathcal{P}(f)$ has distance $l = 2/d$ with $\theta = 1$. □

Corollary 4.14. Let $I$ be the ideal $\langle f_1,\ldots,f_s \rangle$, and suppose the Jacobian matrix
$(\partial f_i/\partial\omega_j)$ has rank $r$ at 0. Then, $\mathrm{RLCT}_0 I \le (\tfrac{1}{2}(r + d), 1)$.

Proof. Because the rank of $(\partial f_i/\partial\omega_j)$ is $r$, there is a linear change of variables
such that the only linear monomials appearing in $I$ are $\omega_1,\ldots,\omega_r$. It follows
that $\mathcal{P}(I)$ lies in the halfspace $\alpha_1 + \cdots + \alpha_r + \tfrac{1}{2}(\alpha_{r+1} + \cdots + \alpha_d) \ge 1$ and its
distance is at least $1/(r + \tfrac{d-r}{2}) = 2/(r + d)$. □

5 Applications to Statistical Models 

In this section, we use our tools to compute the learning coefficients of a naive
Bayesian network $\mathcal{M}$ with two ternary random variables and two hidden states.
It was designed by Evans, Gilula and Guttman [3] for investigating connections
between the recovery time of 132 schizophrenic patients and the frequency of
visits by their relatives. Their data is summarized in the 3x3 contingency table

                      2<=Y<10   10<=Y<20   20<=Y   Totals
  Visited regularly      43        16         3       62
  Visited rarely          6        11        10       27
  Visited never           9        18        16       43
  Totals                 58        45        29      132

which we store as a 3x3 matrix $\hat{q}$ of relative frequencies. The model is given by

$$\begin{aligned}
p : \Omega = \Delta_1 \times \Delta_2 \times \Delta_2 \times \Delta_2 \times \Delta_2 &\to \Delta_8 \\
\omega = (t, a_1, a_2, b_1, b_2, c_1, c_2, d_1, d_2) &\mapsto (p_{ij}) \\
p_{ij} = t a_i b_j + (1 - t) c_i d_j, &\quad i, j \in \{1, 2, 3\}
\end{aligned}$$

where $a_3 = 1 - a_1 - a_2$, $a = (a_1, a_2, a_3) \in \Delta_2$ and similarly for $b$, $c$ and $d$. Hence,
a 3x3 matrix in the model is a convex combination of two rank one matrices,
so it has rank at most two. The marginal likelihood integral $I$
of the data was computed exactly by Sturmfels, Xu and the author [15].

We now estimate this integral using Watanabe's asymptotic formula for the
log likelihood integral in Theorem 1.1. We assume that the data $\hat{q}$ was generated
by some true distribution $q = (q_{ij}) \in \mathbb{R}^{3\times 3}$ in the model. Ideally, we want $q$ to
be equal to the matrix $\hat{q}$ of relative frequencies, but in general, the data $\hat{q}$ rarely
lies in the model. In this example, the matrix $\hat{q}$ is not in the model because it
is full rank. However, we should be able to find a distribution $q$ in the model
that is close to $\hat{q}$, because in practice, we want to study models which describe
the data well. A good candidate for $q$ is the maximum likelihood distribution.
Using the EM algorithm, this distribution is

$$q = \frac{1}{132}\begin{pmatrix}
43.00153927 & 15.99813189 & 3.000328847 \\
5.979732739 & 11.12298188 & 9.897285383 \\
9.018728012 & 17.87888620 & 16.10238577
\end{pmatrix}$$

which comes from the maximum likelihood estimate

$$\begin{aligned}
t &= 0.5129202328, \\
(a_1, a_2) &= (0.09139459898, 0.3457903589), \\
(b_1, b_2) &= (0.1397061214, 0.4386217768), \\
(c_1, c_2) &= (0.8680689680, 0.05580725171), \\
(d_1, d_2) &= (0.7549807403, 0.2380125694).
\end{aligned}$$

Note that the ML distribution $q$ is indeed very close to the data $\hat{q}$.
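The parametrization $p_{ij} = t a_i b_j + (1-t) c_i d_j$ can be evaluated directly at the maximum likelihood estimate. The following minimal sketch uses hypothetical helper names (`model`, `det3`); only the numerical parameter values are taken from the text. It rebuilds the matrix from the estimate and checks that the entries sum to one and that the determinant vanishes up to rounding, as expected for a matrix of rank at most two.

```python
def model(t, a, b, c, d):
    """Return the 3x3 matrix p_ij = t a_i b_j + (1 - t) c_i d_j.

    Each of a, b, c, d is given by its first two coordinates; the third
    is implied by the constraint that the coordinates sum to 1.
    """
    def full(v):
        return (v[0], v[1], 1.0 - v[0] - v[1])

    a, b, c, d = full(a), full(b), full(c), full(d)
    return [[t * a[i] * b[j] + (1 - t) * c[i] * d[j] for j in range(3)]
            for i in range(3)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


# Maximum likelihood estimate quoted in the text.
p = model(0.5129202328,
          (0.09139459898, 0.3457903589),
          (0.1397061214, 0.4386217768),
          (0.8680689680, 0.05580725171),
          (0.7549807403, 0.2380125694))

scaled = [[132 * x for x in row] for row in p]   # compare with 132 * q
for row in scaled:
    print(row)
```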

Our next theorem summarizes how the asymptotics of $\log Z(N)$ depend on
$q$. Let $S_i$ denote the set of rank $i$ matrices in $p(\Omega)$. Let $S_{21}$ be the set of matrices
in $S_2$ where there are permutations of the rows and of the columns such that
$q_{11} = 0$ and $q_{12}, q_{21}, q_{22}$ are all non-zero. Let $S_{22}$ be the subset of $S_2$ where, up
to permutations, $q_{11} = q_{22} = 0$ and $q_{12}, q_{21}$ are non-zero. Before we prove this
theorem, let us apply it to our statistical problem. Using the exact value of $I$
computed by Lin-Sturmfels-Xu [18], we have

$$(\log I)_{\mathrm{exact}} = -273.1911759.$$

Meanwhile, if the BIC was erroneously applied with the dimension $d = 9$ of the
parameter space, we would get

$$(\log I)_{\mathrm{BIC}} = -280.7992160.$$

On the other hand, by calculating the real log canonical threshold of the
polynomial ideal $\langle p(\omega) - q \rangle$, we find that the learning coefficient of the model at the
ML distribution $q$ is $(\lambda, \theta) = (7/2, 1)$. This gives us the approximation

$$(\log I)_{\mathrm{RLCT}} \approx -275.9164140$$

which is closer than the BIC to the exact value of $\log I$.
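The two approximations above can be reproduced from the data. The sketch below rests on one assumption that should be stated plainly: the leading term of the asymptotic formula is taken to be $N \sum_{ij} \hat{q}_{ij} \log q_{ij}$ with $q$ the ML distribution, and the $O(1)$ term is dropped. The function name `log_marginal_approx` is hypothetical; the counts and the ML matrix are those quoted in the text.

```python
import math

N = 132
counts = [[43, 16, 3], [6, 11, 10], [9, 18, 16]]           # N * q_hat
ml_scaled = [[43.00153927, 15.99813189, 3.000328847],
             [5.979732739, 11.12298188, 9.897285383],
             [9.018728012, 17.87888620, 16.10238577]]      # N * q


def log_marginal_approx(lam, theta):
    """N * sum q_hat log q  -  lam * log N  +  (theta - 1) * log log N."""
    loglik = sum(counts[i][j] * math.log(ml_scaled[i][j] / N)
                 for i in range(3) for j in range(3))
    return loglik - lam * math.log(N) + (theta - 1) * math.log(math.log(N))


bic = log_marginal_approx(9 / 2, 1)    # BIC penalty with d = 9, i.e. lam = d/2
rlct = log_marginal_approx(7 / 2, 1)   # learning coefficient (7/2, 1)
exact = -273.1911759                   # exact value quoted in the text

print(bic, rlct)
print(abs(rlct - exact) < abs(bic - exact))
```

The two penalties differ by exactly $\log 132$, which accounts for the gap between the two quoted approximations.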
Theorem 5.1. The learning coefficient $(\lambda, \theta)$ of the model at $q$ is given by

$$(\lambda, \theta) = \begin{cases}
(5/2, 1) & \text{if } q \in S_1, \\
(7/2, 1) & \text{if } q \in S_2 \setminus (S_{21} \cup S_{22}), \\
(4, 1) & \text{if } q \in S_{21} \setminus S_{22}, \\
(9/2, 1) & \text{if } q \in S_{22}.
\end{cases}$$

Therefore, asymptotically as $N \to \infty$,

$$\log Z(N) = N \sum_{i,j} \hat{q}_{ij} \log q_{ij} - \lambda \log N + (\theta - 1) \log\log N + \eta_N$$

where $\hat{q}$ is the matrix of relative frequencies of the data and $\eta_N$ is a random
variable whose expectation $E[\eta_N]$ converges to a constant.

We postpone the proof of this theorem to the end of the section. Let us begin
with a few remarks about our approach to this problem. Firstly, Theorem 1.2
states that the learning coefficient $(\lambda, \theta)$ of the statistical model is given by

$$(2\lambda, \theta) = \min_{\omega^* \in V} \mathrm{RLCT}_{\Omega_{\omega^*}} \langle p(\omega) - q \rangle$$

where $V$ is the fiber $p^{-1}(q) = \{\omega \in \Omega : p(\omega) = q\}$ over $q$. Instead of focusing on
a fixed $q$ and its fiber $V$, let us vary the parameter $\omega^*$ over all of $\Omega$. For each
$\omega^* \in \Omega$, we translate $\Omega$ so that $\omega^*$ is the origin and compute the RLCT of the
ideal $\langle p(\omega + \omega^*) - p(\omega^*) \rangle$. This is the content of Proposition 5.3. The proof of
Theorem 5.1 will then consist of minimizing these RLCTs over the fiber $V$ for
each $q$ in the model.

Secondly, in our computations, we will often be choosing different generators
for our ideal and making appropriate changes of variables. Generators with few
terms and small total degree will be highly desired. Another useful trick is to
multiply or divide the generators by functions $f(\omega)$ satisfying $f(0) \ne 0$. Such
functions are units in the ring $\mathcal{A}_0$ of real analytic functions so this multiplication
or division will not change the ideal generated. This next lemma also comes in
handy in dealing with boundary issues.

Lemma 5.2. Let $\Omega \subset \{(x_1,\ldots,x_d) \in \mathbb{R}^d\}$ be semianalytic. Let $I$ be a monomial
ideal and $\varphi$ a monomial function in $x_1,\ldots,x_r$. If there exists a vector $\xi \in \mathbb{R}^{d-r}$
such that $\Omega_1 \times \Omega_2 \subset \Omega$ for sufficiently small $\varepsilon$ where

$$\Omega_1 = \{(x_1,\ldots,x_r) \in [0,\varepsilon]^r\}$$
$$\Omega_2 = \{(x_{r+1},\ldots,x_d) = t(\xi + \xi') \text{ where } t \in [0,\varepsilon],\ \xi' \in [-\varepsilon,\varepsilon]^{d-r}\},$$

then $\mathrm{RLCT}_\Omega(I;\varphi) = \mathrm{RLCT}_0(I;\varphi)$.

Proof. Because $I$ and $|\varphi|$ remain unchanged by the flipping of signs of $x_1,\ldots,x_r$,
their threshold does not depend on the choice of orthant, so $\mathrm{RLCT}_{\Omega_1}(I;\varphi) =
\mathrm{RLCT}_0(I;\varphi)$. The lemma now follows from Proposition 3.7 and the fact that
the threshold of the zero ideal over the cone neighborhood $\Omega_2$ is $(\infty, -)$. □
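The minimization over the fiber described above amounts to taking the smallest pair in a list and halving its first entry. The sketch below assumes the ordering of RLCT pairs that is defined outside this section: pairs are compared by $\lambda$ first, and for equal $\lambda$ the pair with the larger $\theta$ counts as smaller. The function names and the sample list of pairs are hypothetical.

```python
from fractions import Fraction


def rlct_key(pair):
    """Sort key for RLCT pairs: smaller lambda first, then larger theta."""
    lam, theta = pair
    return (lam, -theta)


def rlct_min(pairs):
    """Smallest pair under the assumed ordering."""
    return min(pairs, key=rlct_key)


def learning_coefficient(pairs):
    """Given the pairs (2*lambda, theta) found along a fiber, return (lambda, theta)."""
    two_lam, theta = rlct_min(pairs)
    return (Fraction(two_lam, 2), theta)


# Hypothetical list of pairs collected at different points of one fiber.
fiber_pairs = [(6, 2), (5, 1), (6, 1)]
print(learning_coefficient(fiber_pairs))
```

For instance, when the smallest pair over a fiber is (5, 1), the learning coefficient is (5/2, 1), the value stated in Theorem 5.1 for $q \in S_1$.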
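To state our next proposition, let us define the following subsets of $\Omega$. These
subsets stratify $\Omega$ according to the real log canonical threshold in the manner
described in Conjecture 2.8.

$$\begin{aligned}
\Omega_u &= \{\omega^* \in \Omega : t^* \in \{0, 1\}\} \\
\Omega_m &= \{\omega^* \in \Omega : t^* \notin \{0, 1\}\} \\
\Omega_{m0} &= \{\omega^* \in \Omega_m : a^* = c^*,\ b^* = d^*\} \\
\Omega_{m0kl} &= \{\omega^* \in \Omega_{m0} : \#\{i : a_i^* = 0\} = k,\ \#\{i : b_i^* = 0\} = l\} \\
\Omega_{m1} &= \{\omega^* \in \Omega_m : (b^* \ne d^*,\ a^* = c^*) \text{ or } (a^* \ne c^*,\ b^* = d^*)\} \\
\Omega_{m10} &= \{\omega^* \in \Omega_{m1} : (a^* = c^*,\ \exists\, i\ a_i^* = 0) \text{ or } (b^* = d^*,\ \exists\, i\ b_i^* = 0)\} \\
\Omega_{m2} &= \{\omega^* \in \Omega_m : a^* \ne c^*,\ b^* \ne d^*\} \\
\Omega_{m2ad} &= \{\omega^* \in \Omega_{m2} : \exists\, i, j\ \ a_i^* = d_j^* = 0,\ c_i^* \ne 0,\ b_j^* \ne 0\} \\
\Omega_{m2bc} &= \{\omega^* \in \Omega_{m2} : \exists\, i, j\ \ b_i^* = c_j^* = 0,\ d_i^* \ne 0,\ a_j^* \ne 0\}
\end{aligned}$$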

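Proposition 5.3. Given $\omega^* \in \Omega$, let $I$ be the ideal $\langle p(\omega + \omega^*) - p(\omega^*) \rangle$. Then,

$$\mathrm{RLCT}_{\Omega_0} I = \begin{cases}
(5, 1) & \text{if } \omega^* \in \Omega_u, \\
(6, 2) & \text{if } \omega^* \in \Omega_{m000}, \\
(6, 1) & \text{if } \omega^* \in \Omega_{m010} \cup \Omega_{m001} \cup \Omega_{m020} \cup \Omega_{m002}, \\
(7, 2) & \text{if } \omega^* \in \Omega_{m011}, \\
(7, 1) & \text{if } \omega^* \in \Omega_{m012} \cup \Omega_{m021}, \\
(8, 1) & \text{if } \omega^* \in \Omega_{m022}, \\
(6, 1) & \text{if } \omega^* \in \Omega_{m1} \setminus \Omega_{m10}, \\
(7, 1) & \text{if } \omega^* \in \Omega_{m10}, \\
(7, 1) & \text{if } \omega^* \in \Omega_{m2} \setminus \Omega_{m21}, \\
(8, 1) & \text{if } \omega^* \in \Omega_{m21} \setminus \Omega_{m22}, \\
(9, 1) & \text{if } \omega^* \in \Omega_{m22}.
\end{cases}$$

Remark 5.4. In the proof, we compute real log canonical thresholds by hand to
demonstrate how the various properties from Section 3 can be applied. At points
in the proof where RLCTs of monomial ideals are required, the Singular library
from Section [T] can also be used. It is our hope that some day the computation
of learning coefficients for statistical models will be fully automated.

Proof. The ideal $I$ is generated by $g_{ij} = f_{ij}(\omega + \omega^*) - f_{ij}(\omega^*)$ where

$$f_{ij} = t a_i b_j + (1 - t) c_i d_j, \quad i, j \in \{0, 1, 2\}$$

and $a_0 = b_0 = c_0 = d_0 = 1$. One can check that $I$ is also generated by $g_{10}, g_{20},
g_{01}, g_{02}$, and $g_{ij} - (d_j + d_j^*) g_{i0} - (a_i + a_i^*) g_{0j}$, $i, j \in \{1, 2\}$ which expand (up to sign) to give

$$\begin{aligned}
& c_1(t_1^* - t) + a_1(t_0^* + t) + t u_1^* \\
& c_2(t_1^* - t) + a_2(t_0^* + t) + t u_2^* \\
& d_1(t_1^* - t) + b_1(t_0^* + t) + t v_1^* \\
& d_2(t_1^* - t) + b_2(t_0^* + t) + t v_2^* \\
& a_1 d_1 - a_1 t_0^* v_1^* + d_1 t_1^* u_1^* \\
& a_1 d_2 - a_1 t_0^* v_2^* + d_2 t_1^* u_1^* \\
& a_2 d_1 - a_2 t_0^* v_1^* + d_1 t_1^* u_2^* \\
& a_2 d_2 - a_2 t_0^* v_2^* + d_2 t_1^* u_2^*
\end{aligned}$$

where $t_0^* = t^*$, $t_1^* = 1 - t^*$, $u_i^* = a_i^* - c_i^*$, $v_i^* = b_i^* - d_i^*$. Note that $\sum (a_i + a_i^*) = 1$
and $\sum a_i^* = 1$ so $\sum a_i = 0$ and similarly for $b, c, d$. Also, $\sum u_i^* = \sum a_i^* - c_i^* = 0$.
The same is true for $v^*$. We now do a case-by-case analysis.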

Case 1: $\omega^* \in \Omega_m$.

This implies $t_0^* \ne 0$ and $t_1^* \ne 0$. Since the indeterminates $b_1, b_2, c_1, c_2$ appear
only in the first four polynomials, this suggests the change of variables

$$c_i = \big(c_i' - t u_i^* - a_i(t_0^* + t)\big)/(t_1^* - t), \quad i = 1, 2$$
$$b_i = \big(b_i' - t v_i^* - d_i(t_1^* - t)\big)/(t_0^* + t), \quad i = 1, 2$$

with new indeterminates $t, a_1, a_2, b_1', b_2', c_1', c_2', d_1, d_2$. In view of Proposition 3.8,
the Jacobian determinant of this substitution is a constant.

Case 1.1: $\omega^* \in \Omega_{m1}$.

This implies $u^* \ne 0, v^* = 0$ or $u^* = 0, v^* \ne 0$. Without loss of generality, we
assume $v^* = 0$, $u_1^* > 0$ and substitute

$$d_i = (d_i' + a_1 t_0^* v_i^*)/(t_1^* u_1^* + a_1), \quad i = 1, 2.$$

The resulting pullback ideal is $\langle b_1', b_2', c_1', c_2', d_1', d_2' \rangle$. If $\omega^*$ lies in the interior of
$\Omega$, we use either Newton polyhedra or Proposition 3.7 to show that the RLCT
of this monomial ideal is (6, 1). If $\omega^*$ lies on the boundary of $\Omega$, the situation
is more complicated. Since we are considering a subset of a neighborhood of
$\omega^*$, the corresponding Laplace integral from Proposition 3.2a is smaller so the
threshold is at least (6, 1). To compute it exactly, we need blowups to separate
the coordinate hyperplanes and the hypersurfaces defining the boundary.

Because $-u_1^* = u_2^* + u_3^*$, we cannot have $u_2^* = u_3^* = 0$. Suppose $u_2^* \ne 0$ and
$u_3^* \ne 0$. We consider a blowup where one of the charts is given by the monomial
map $t = s$, $a_i = s a_i'$, $c_1' = rs$, $c_2' = rs c_2''$, $b_i' = rs b_i''$, $d_i' = rs d_i''$. Here, the pullback
pair is $(\langle rs \rangle; r^5 s^8)$. Now, we study the inequalities which are active at $\omega^*$. For
instance, if $b_1^* = 0$, then $\omega^*$ lies on the boundary defined by $0 \le b_1 + b_1^*$. After
the various changes of variables, the inequalities are as shown below, where
$b_3'' = -b_1'' - b_2''$ and similarly for $c_3'', d_3''$ and $a_3'$. Note that the inequality for
$a_1^* = 0$ is omitted because $a_1^* = 0$ implies $u_1^* = -c_1^* \le 0$. Similar conditions on
the $u_i^*, v_i^*$ hold for the other inequalities.

$$\begin{array}{lll}
b_i^* = 0: & 0 \le rs\big(b_i'' - d_i''(t_1^* - s)/(t_1^* u_1^* + s a_1')\big)/(t_0^* + s) & \\
d_i^* = 0: & 0 \le rs\, d_i''/(t_1^* u_1^* + s a_1') & \\
c_1^* = 0: & 0 \le s\big(-u_1^* - a_1'(t_0^* + s) + r\big)/(t_1^* - s) & \\
c_2^* = 0: & 0 \le s\big(-u_2^* - a_2'(t_0^* + s) + r c_2''\big)/(t_1^* - s) & u_2^* > 0 \\
c_3^* = 0: & 0 \le s\big(-u_3^* - a_3'(t_0^* + s) - r - r c_2''\big)/(t_1^* - s) & u_3^* > 0 \\
a_2^* = 0: & 0 \le s a_2' & u_2^* < 0 \\
a_3^* = 0: & 0 \le s a_3' & u_3^* < 0
\end{array}$$
In applying Lemma 5.2, the choice of coordinates is important. For instance, if
$b_2^* = b_3^* = 0$, we choose coordinates $b_2''$ and $b_3''$ and set $b_1'' = -b_2'' - b_3''$. The same
is done for the $d_i''$. The pullback pair is unchanged by these choices. Now, with
coordinates $(r, s)$ and $(b_1'', b_2'', d_1'', d_2'', c_2'', a_2', a_3')$, we apply the lemma with the
vector $\xi = (2, 2, u_1^*, u_1^*, 1, 1, 1)$, so the threshold is $\mathrm{RLCT}_0(rs; r^5 s^8) = (6, 1)$.
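Thresholds such as $\mathrm{RLCT}_0(rs; r^5 s^8) = (6, 1)$ follow from the poles of a one-term zeta function. A minimal sketch, with the hypothetical name `monomial_rlct`: for the ideal generated by a single monomial $\omega^\kappa$ with amplitude $\omega^\tau$, the integrand is $\prod_i |\omega_i|^{-\kappa_i z + \tau_i}$, so the poles are at $z = (\tau_i + 1)/\kappa_i$, and the pair consists of the smallest such ratio and the number of coordinates attaining it. The sketch assumes an interior point (or an orthant neighborhood) and at least one $\kappa_i > 0$.

```python
from fractions import Fraction


def monomial_rlct(kappa, tau):
    """Pair (lambda, theta) of the ideal <w^kappa> with amplitude w^tau.

    kappa and tau are tuples of non-negative integers of equal length;
    coordinates with kappa_i = 0 contribute no pole and are skipped.
    """
    ratios = [Fraction(t + 1, k) for k, t in zip(kappa, tau) if k > 0]
    lam = min(ratios)
    return lam, ratios.count(lam)


# Pullback pairs that appear in the proof of Proposition 5.3.
print(monomial_rlct((1, 1), (5, 8)))   # (<rs>; r^5 s^8)
print(monomial_rlct((1,), (6,)))       # (<s>; s^6)
print(monomial_rlct((1,), (7,)))       # (<s>; s^7)
print(monomial_rlct((1,), (8,)))       # (<s>; s^8)
```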

Now, if only one of $u_2^*, u_3^*$ is zero, suppose $u_2^* = 0$, $u_3^* \ne 0$ without loss of
generality. If $a_2^* = c_2^* \ne 0$, then the arguments of the previous paragraph show
that the RLCT is again (6, 1). If $a_2^* = c_2^* = 0$, we blow up the origin in $\mathbb{R}^7$ and
consider the chart where $a_2 = s$, $c_i' = s c_i''$, $b_i' = s b_i''$, $d_i' = s d_i''$. The pullback pair
is $(\langle s b_1'', s b_2'', s c_1'', s c_2'', s d_1'', s d_2'' \rangle; s^6)$. The active inequalities for $a_2^* = c_2^* = 0$ are

$$\begin{array}{ll}
c_2^* = 0: & 0 \le s\big(c_2'' - (t_0^* + t)\big)/(t_1^* - t) \\
a_2^* = 0: & 0 \le s.
\end{array}$$

Near the origin in $(s, b_1'', b_2'', c_1'', c_2'', d_1'', d_2'') \in \mathbb{R}^7$, these inequalities imply $s = 0$
so the new region $\mathcal{M}$ defined by the active inequalities is not full at the origin.
Thus, we can ignore the origin in computing the RLCT. All other points on the
exceptional divisor of this blowup lie on some other chart of the blowup where
the pullback pair is $(\langle s \rangle; s^6)$, so the RLCT is at least (7, 1). In the chart where
$c_2' = s$, $c_1' = s c_1''$, $a_2 = s a_2'$, $b_i' = s b_i''$, $d_i' = s d_i''$, we have the active inequalities
below. Note that $c_3^* \ne 0$ because $u_3^* = -u_1^* < 0$.

$$\begin{array}{ll}
b_i^* = 0: & 0 \le s\big(b_i'' - d_i''(t_1^* - t)/(t_1^* u_1^* - (s a_2' + a_3))\big)/(t_0^* + t) \\
d_i^* = 0: & 0 \le s\, d_i''/(t_1^* u_1^* - (s a_2' + a_3)) \\
c_1^* = 0: & 0 \le \big(s c_1'' - t u_1^* + (s a_2' + a_3)(t_0^* + t)\big)/(t_1^* - t) \\
c_2^* = 0: & 0 \le s\big(1 - a_2'(t_0^* + t)\big)/(t_1^* - t) \\
a_2^* = 0: & 0 \le s a_2' \\
a_3^* = 0: & 0 \le a_3
\end{array}$$

Again, choosing suitable coordinates in the $b_i''$ and $d_i''$, we find that the RLCT
is (7, 1) by using Lemma 5.2 with $\xi = (2, 2, u_1^*, u_1^*, 1, 1, 1, -1)$ in coordinates
$(b_1'', b_2'', d_1'', d_2'', a_2', a_3, c_1'', t)$.

Case 1.2: $\omega^* \in \Omega_{m2}$.

This implies $u^* \ne 0, v^* \ne 0$. Without loss of generality, suppose that $u_1^* \ne 0$.
If $\omega^* \in \Omega_{m21}$, we further assume that $a_1^* = d_1^* = 0$, $u_1^* \ne 0$, $v_1^* \ne 0$. Substituting

$$d_i = (d_i' + a_1 t_0^* v_i^*)/(a_1 + t_1^* u_1^*), \quad i = 1, 2$$
$$a_2 = (a_2' + a_1 u_2^*)/u_1^*,$$

the pullback ideal is $\langle a_2', b_1', b_2', c_1', c_2', d_1', d_2' \rangle$ so the RLCT is at least (7, 1). Note
that $a_i = (a_2' w_i^* + a_1 u_i^*)/u_1^*$ for $i = 1, 2, 3$ where $w_i^* = 0, 1, -1$ respectively. If $\omega^*$
is not in $\Omega_{m21}$, we consider the blowup chart $a_2' = s$, $b_i' = s b_i''$, $c_i' = s c_i''$, $d_i' = s d_i''$.
The active inequalities are as follows. The symbol $v_i-$ denotes $v_i^* < 0$.

$$\begin{array}{lll}
b_i^* = 0: & 0 \le \big[s b_i'' - t v_i^* - (s d_i'' + a_1 t_0^* v_i^*)(t_1^* - t)/(t_1^* u_1^* + a_1)\big]/(t_0^* + t) & v_i- \\
c_i^* = 0: & 0 \le \big[s c_i'' - t u_i^* - (s w_i^* + a_1 u_i^*)(t_0^* + t)/u_1^*\big]/(t_1^* - t) & u_i+ \\
a_i^* = 0: & 0 \le (s w_i^* + a_1 u_i^*)/u_1^* & u_i- \\
d_i^* = 0: & 0 \le (s d_i'' + a_1 t_0^* v_i^*)/(t_1^* u_1^* + a_1) & v_i+
\end{array}$$

The crux to understanding the inequalities is this: if $a_i^* = d_j^* = 0$, $u_i^* \ne 0$, $v_j^* \ne 0$,
the coefficient of $a_1$ appears with different signs in the inequalities for $a_i^* = 0$
and $d_j^* = 0$. This makes it difficult to choose a suitable vector $\xi$ for Lemma 5.2.
Similarly, if $b_i^* = c_j^* = 0$, $v_i^* \ne 0$, $u_j^* \ne 0$, the coefficient of $u_1^* t + t_0^* a_1$ appears with
different signs. Fortunately, since $\omega^* \notin \Omega_{m21}$, we do not have such obstructions
and it is an easy exercise to find the vector $\xi$. Thus, the RLCT is (7, 1).

If $\omega^* \in \Omega_{m21} \setminus \Omega_{m22}$, we blow up $a_1 = s$, $a_2' = s a_2''$, $b_i' = s b_i''$, $c_i' = s c_i''$, $d_i' =
s d_i''$. The active inequalities for $a_1^* = d_1^* = 0$ imply that the new region $\mathcal{M}$ is
not full at the origin of this chart. Thus, we shift our focus to the other charts
of the blowup where the pullback pair is $(\langle s \rangle; s^7)$, so the RLCT is at least (8, 1).
In the chart where $a_2' = s$, $a_1 = s a_1'$, $b_i' = s b_i''$, $c_i' = s c_i''$, $d_i' = s d_i''$, we do not have
obstructions coming from any $b_i^* = c_j^* = 0$, $v_i^* \ne 0$, $u_j^* \ne 0$ so it is again easy to
find the vector $\xi$ for Lemma 5.2. The threshold is exactly (8, 1).

If $\omega^* \in \Omega_{m22}$, consider the following two charts out of the nine charts in the
blowup of the origin in $\mathbb{R}^9$.

Chart 1: $a_1 = s$, $t = s t'$, $a'_2 = s a''_2$, $b'_i = s b''_i$, $c'_i = s c''_i$, $d'_i = s d''_i$
Chart 2: $t = s$, $a_1 = s a'_1$, $a'_2 = s a''_2$, $b'_i = s b''_i$, $c'_i = s c''_i$, $d'_i = s d''_i$

The inequalities for $a_i^* = d_j^* = 0$, $u_i^* \neq 0$, $v_j^* \neq 0$ and $b_j^* = c_i^* = 0$, $v_j^* \neq 0$, $u_i^* \neq 0$
imply that the new region $\mathcal{M}$ is not full at points outside of the other seven
charts, so we may ignore these two charts in computing the RLCT. Indeed, for
Chart 1, the active inequalities

$a_i^* = 0$: $0 < s(a''_2 w_i^* + u_i^*)/u_1^*$    $u-$
$d_j^* = 0$: $0 < s(d''_j + t_0^* v_j^*)/(t_1^* u_1^* + s)$    $v+$

tell us that $a''_2$ or $d''_j$ must be non-zero for $\mathcal{M}$ to be full. In Chart 2, suppose $\mathcal{M}$
is full at some point $x$ where $a''_2 = b''_1 = b''_2 = c''_1 = c''_2 = d''_1 = d''_2 = 0$. Then,

$a_i^* = 0$: $0 < s(a''_2 w_i^* + a'_1 u_i^*)/u_1^*$    $u-$
$d_j^* = 0$: $0 < s(d''_j + a'_1 t_0^* v_j^*)/(t_1^* u_1^* + s a'_1)$    $v+$

imply that $a'_1 = 0$ at $x$. However, if this is the case, the inequalities

$b_j^* = 0$: $0 < s[b''_j - v_j^* - (d''_j + a'_1 t_0^* v_j^*)(t_1^* - s)/(t_1^* u_1^* + s a'_1)]/(t_0^* + s)$    $v-$
$c_i^* = 0$: $0 < s[c''_i - u_i^* - (a''_2 w_i^* + a'_1 u_i^*)(t_0^* + s)/u_1^*]/(t_1^* - s)$    $u+$

force $b''_i$ or $c''_i$ to be non-zero for some $i$, a contradiction. Thus, we shift our focus
to the other seven charts where the pullback pair is $(s; s^8)$ and the RLCT is at
least (9, 1). In the chart for $a'_2 = s$, $a_1 = s a'_1$, $t = s t'$, $b'_i = s b''_i$, $c'_i = s c''_i$, $d'_i = s d''_i$,
note that we cannot have both a*. = and — because we assumed = 0. 
It is now easy to find the vector $\xi$ for Lemma 5.2, so the threshold is (9, 1).
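
The way the exceptional coordinate $s$ factors out of an inequality under a blowup chart can be checked numerically. The sketch below assumes our reading of the substitution $d_i = (d'_i + a_1 t_0^* v_i^*)/(a_1 + t_1^* u_1^*)$ and of Chart 2 ($a_1 = s a'_1$, $d'_i = s d''_i$). The function names and the test point are hypothetical, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction as F

def d_before_blowup(d_prime, a1, t0, t1, u1, v):
    """d_i in the pre-blowup coordinates (assumed form of the substitution)."""
    return (d_prime + a1 * t0 * v) / (a1 + t1 * u1)

def d_in_chart2(d_dprime, a1_prime, s, t0, t1, u1, v):
    """The same quantity after substituting a_1 = s*a_1' and d_i' = s*d_i''."""
    return s * (d_dprime + a1_prime * t0 * v) / (t1 * u1 + s * a1_prime)

# Arbitrary rational test point (illustrative values only).
s, a1p, ddp = F(1, 7), F(2, 3), F(-1, 5)
t0, t1, u1, v = F(1, 3), F(2, 3), F(1, 4), F(1, 2)

lhs = d_before_blowup(s * ddp, s * a1p, t0, t1, u1, v)
rhs = d_in_chart2(ddp, a1p, s, t0, t1, u1, v)
print(lhs == rhs)
```

The two expressions agree identically, which is why the chart inequalities carry an overall factor of $s$.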

Case 1.3: $\omega^* \in \Omega_{m0}$.

This implies $u_i^* = v_i^* = 0$ for all $i$. The pullback ideal can be written as

$(b'_1, b'_2, c'_1, c'_2) + (a_1, a_2)(d_1, d_2)$

whose RLCT over an interior point of $\Omega$ is (6, 2) by Proposition 3.7. This occurs
in $\Omega_{m000}$ where none of the inequalities are active. Now, suppose the only active
inequalities come from $a_1^* = c_1^* = 0$. We blow up the origin in $\{(a_1, c'_1) \in \mathbb{R}^2\}$. In
the chart given by $a_1 = a'_1$, $c'_1 = a'_1 c''_1$, the new region $\mathcal{M}$ is not full at the origin,
so we only need to study the chart where $c'_1 = c''_1$, $a_1 = c''_1 a'_1$. The pullback pair
becomes $((c''_1) + (b'_1, b'_2, c'_2) + (a_2)(d_1, d_2);\ c''_1)$, and a simple application of Lemma
5.2 and Proposition 3.7 shows that the threshold is (6, 1).
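
The two thresholds in this paragraph can be reproduced with a few lines of arithmetic. This is a sketch under stated assumptions, not the paper's method: it takes Proposition 3.7 to say that, for ideals in disjoint sets of variables, the threshold of a sum is $(\lambda_1 + \lambda_2, \theta_1 + \theta_2 - 1)$ and the threshold of a product has the smaller $\lambda$, with the orders $\theta$ added on a tie. It also assumes that the ideal generated by $n$ coordinates has threshold $(n, 1)$ and that the pair $(c''_1; c''_1)$ has threshold (2, 1). The boundary conditions handled by Lemma 5.2 are ignored, and all function names are hypothetical.

```python
def rlct_sum(p, q):
    """Threshold of I + J for ideals in disjoint variables (assumed rule)."""
    return (p[0] + q[0], p[1] + q[1] - 1)

def rlct_product(p, q):
    """Threshold of I * J for ideals in disjoint variables (assumed rule)."""
    if p[0] != q[0]:
        return min(p, q, key=lambda pair: pair[0])
    return (p[0], p[1] + q[1])

def coords(n):
    """Assumed threshold of the ideal generated by n coordinate functions."""
    return (n, 1)

# Interior point: (b1', b2', c1', c2') + (a1, a2)(d1, d2)
interior = rlct_sum(coords(4), rlct_product(coords(2), coords(2)))

# Boundary chart: ((c1'') + (b1', b2', c2') + (a2)(d1, d2); c1'')
weighted_c = (2, 1)  # assumed threshold of the pair (c1''; c1'')
boundary = rlct_sum(rlct_sum(weighted_c, coords(3)),
                    rlct_product(coords(1), coords(2)))

print(interior, boundary)
```

Under these assumptions the script gives (6, 2) for the interior point and (6, 1) for the boundary chart, matching the values stated in the text.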

In this fashion, we study the different scenarios and summarize the pullback 
pairs and thresholds in the table below. 



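Inequalities                                 Pullback pair                                                   RLCT

none                                         $((b'_1, b'_2, c'_1, c'_2) + (a_1, a_2)(d_1, d_2);\ 1)$         (6, 2)
$a_1^* = c_1^* = 0$                          $((b'_1, b'_2, c''_1, c'_2) + (a_2)(d_1, d_2);\ c''_1)$         (6, 1)
$a_1^* = c_1^* = 0$, $b_1^* = d_1^* = 0$     $((b''_1, b'_2, c''_1, c'_2) + (a_2)(d_2);\ b''_1 c''_1)$       (7, 2)
$a_3^* = c_3^* = 1$                          $((b'_1, b'_2, c''_1, c''_2);\ c''_1 c''_2)$                    (6, 1)
$a_3^* = c_3^* = 1$, $b_1^* = d_1^* = 0$     $((b''_1, b'_2, c''_1, c''_2);\ b''_1 c''_1 c''_2)$             (7, 1)
$a_3^* = c_3^* = 1$, $b_3^* = d_3^* = 1$     $((b''_1, b''_2, c''_1, c''_2);\ b''_1 b''_2 c''_1 c''_2)$      (8, 1)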



For example, the case $a_3^* = c_3^* = 1$ corresponds to $a_1^* = a_2^* = c_1^* = c_2^* = 0$. Here,
we blow up the origins in $\{(a_1, c'_1) \in \mathbb{R}^2\}$ and $\{(a_2, c'_2) \in \mathbb{R}^2\}$. As before, we can
ignore the other charts and just consider the one where $a_1 = c''_1 a'_1$, $c'_1 = c''_1$, $a_2 =
c''_2 a'_2$, $c'_2 = c''_2$. The pullback pair is $((c''_1) + (c''_2) + (b'_1, b'_2);\ c''_1 c''_2)$. If $b_i^* \neq 0$ for all
$i$, the RLCT is (6, 1) by Lemma 5.2 and Proposition 3.7.

Case 2: $\omega^* \in \Omega_u$.

Without loss of generality, assume $t^* = 0$ and substitute

$c_i = (c'_i - t(a_i + u_i^*))/(1 - t), \quad i = 1, 2$
$d_i = (d'_i - t(b_i + v_i^*))/(1 - t), \quad i = 1, 2.$

The pullback ideal is the sum of $(c'_1, c'_2, d'_1, d'_2)$ and

$(t)(a_1 + u_1^*, a_2 + u_2^*)(b_1 + v_1^*, b_2 + v_2^*).$

Since $c'_3 = -c'_1 - c'_2$ and similarly for the $d'_i$, $a_i$, $b_i$, $u_i^*$ and $v_i^*$, it is useful to write
this ideal more symmetrically as the sum of $(c'_1, c'_2, c'_3)$, $(d'_1, d'_2, d'_3)$ and

$(t)(a_1 + u_1^*, a_2 + u_2^*, a_3 + u_3^*)(b_1 + v_1^*, b_2 + v_2^*, b_3 + v_3^*).$
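
The Case 2 substitution for $c_i$ can be sanity-checked with exact arithmetic. The sketch below assumes that $c'$ is the deviation of the marginal $t a + (1-t) c$ from its true value, that $u^* = a^* - c^*$, and that $t^* = 0$. The parameter values and the helper name `c_deviation` are hypothetical.

```python
from fractions import Fraction as F

def c_deviation(c_prime, a, u_star, t):
    """Case 2 substitution c_i = (c'_i - t (a_i + u*_i)) / (1 - t), componentwise."""
    return [(cp - t * (ai + ui)) / (1 - t)
            for cp, ai, ui in zip(c_prime, a, u_star)]

# Hypothetical true parameters with t* = 0.
a_star = [F(1, 2), F(1, 4), F(1, 4)]
c_star = [F(1, 3), F(1, 3), F(1, 3)]
u_star = [x - y for x, y in zip(a_star, c_star)]

# Hypothetical deviations; each sums to zero so the perturbed vectors stay in the simplex.
a = [F(1, 10), F(-1, 20), F(-1, 20)]
c = [F(1, 50), F(1, 50), F(-1, 25)]
t = F(1, 8)

# Deviation of the marginal t*(a* + a) + (1 - t)*(c* + c) from its true value c*.
c_prime = [t * (xs + x) + (1 - t) * (ys + y) - ys
           for xs, x, ys, y in zip(a_star, a, c_star, c)]

recovered = c_deviation(c_prime, a, u_star, t)
print(recovered == c)
```

The formula recovers the deviation `c` exactly from the marginal deviation `c_prime`, which is what makes $(c', t, a)$ usable as coordinates in place of $(c, t, a)$.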
Meanwhile, the inequalities are

$a_i^* = 0$: $0 < a_i$
$c_i^* = 0$: $0 < (c'_i - t(a_i + u_i^*))/(1 - t)$
$b_j^* = 0$: $0 < b_j$
$d_j^* = 0$: $0 < (d'_j - t(b_j + v_j^*))/(1 - t)$
$t^* = 0$: $0 < t$



We now relabel the indices of the $a_i$ and $c'_i$, without changing the $b_j$ and $d'_j$, so
that the active inequalities are among those from $a_1^* = 0$, $a_2^* = 0$, $c_{i_1}^* = 0$, $c_{i_2}^* = 0$.
The $b_j$ and $d'_j$ are thereafter also relabeled so that the inequalities come from
$b_1^* = 0$, $b_2^* = 0$, $d_{j_1}^* = 0$, $d_{j_2}^* = 0$. We claim that the new region $\mathcal{M}$ contains, for
small $\varepsilon$, the orthant neighborhood

$\{(a_1, a_2, b_1, b_2, c_{i_1}, c_{i_2}, d_{j_1}, d_{j_2}, t) \in [0, \varepsilon]^9\}.$

Indeed, the only problematic inequalities are 

$c_3^* = 0$: $0 < (c'_3 - t(-a_1 - a_2 + u_3^*))/(1 - t)$    $u_3^* = 0$
$d_3^* = 0$: $0 < (d'_3 - t(-b_1 - b_2 + v_3^*))/(1 - t)$    $v_3^* = 0$.

However, these inequalities cannot occur because for instance, $u_3^* = 0$ and $c_3^* = 0$
implies $a_3^* = 0$, a contradiction since the $a_i$ were relabeled to avoid this. Finally,
the threshold of $(t)$ is (1, 1) while those of $(a_1 + u_1^*, a_2 + u_2^*)$ and $(b_1 + v_1^*, b_2 + v_2^*)$
are at least (2, 1) each. By Proposition 3.7, the RLCT of their product is (1, 1)
and that of the pullback ideal we were originally interested in is (5, 1). □

Proof of Theorem 5.1. Given a matrix $q = (q_{ij})$, the learning coefficient $(\lambda, \theta)$
of the model at $q$ is the minimum of RLCTs at points $\omega^* \in \Omega$ where $p(\omega^*) = q$.
The theorem then follows from Proposition 5.3, Theorem 1.1 and the claims

$p(\Omega_u) = S_1$, $p(\Omega_{m0}) \subset S_1$, $p(\Omega_{m1}) \subset S_1$, $p(\Omega_{m21}) = S_{21}$, $p(\Omega_{m22}) = S_{22}$.
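
Taking the minimum of RLCTs requires an ordering on pairs $(\lambda, \theta)$. The sketch below assumes the usual convention, which is not restated in this passage: a pair is smaller when $\lambda$ is smaller, or when $\lambda$ is equal and $\theta$ is larger. The candidate list is illustrative sample input only and is not the fiber of any particular $q$; the function names are hypothetical.

```python
def rlct_key(pair):
    """Sort key for the assumed ordering: smaller lambda first, then larger theta."""
    lam, theta = pair
    return (lam, -theta)

def min_rlct(pairs):
    """Minimum of a collection of (lambda, theta) pairs under the assumed ordering."""
    return min(pairs, key=rlct_key)

# Illustrative candidates only.
candidates = [(7, 1), (6, 1), (6, 2), (8, 1)]
print(min_rlct(candidates))
```

With this ordering the pair (6, 2) is selected over (6, 1), since the larger order $\theta$ wins when the $\lambda$ values tie.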

The first three claims are trivial. The proofs of the last two claims are similar, so
we only show $p(\Omega_{m22}) = S_{22}$. First, it is easy to check that $p(\Omega_{m22}) \subset S_{22}$. Now,
if $p(\omega^*) = q \in S_{22}$, then $q_{11} = t^* a_1^* b_1^* + (1 - t^*) c_1^* d_1^* = 0$ implies that $a_1^* = 0$ or
$b_1^* = 0$ because the parameters are nonnegative. Without loss of generality, suppose
$a_1^* = 0$. Because $q_{12} \neq 0$, we have $c_1^* \neq 0$ which leads us to $d_1^* = 0$ and $b_1^* \neq 0$.
The condition $q_{22} = 0$ then shows that $b_2^* = c_2^* = 0$, $a_2^* \neq 0$, $d_2^* \neq 0$. Therefore,
$\omega^* \in \Omega_{m22}$ and $p(\Omega_{m22}) \supset S_{22}$. □
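
The zero pattern derived in this proof can be illustrated with a concrete point. The sketch below uses the mixture formula $q_{ij} = t a_i b_j + (1 - t) c_i d_j$ from the proof; the parameter values are hypothetical and chosen only so that $a_1 = d_1 = 0$ and $b_2 = c_2 = 0$ while all other entries are non-zero.

```python
from fractions import Fraction as F

def mixture(t, a, b, c, d):
    """3x3 matrix with entries q_ij = t*a_i*b_j + (1 - t)*c_i*d_j."""
    return [[t * a[i] * b[j] + (1 - t) * c[i] * d[j] for j in range(3)]
            for i in range(3)]

# Hypothetical parameters with the zero pattern a_1 = d_1 = 0, b_2 = c_2 = 0.
t = F(1, 2)
a = [F(0), F(1, 2), F(1, 2)]
b = [F(1, 2), F(0), F(1, 2)]
c = [F(1, 2), F(0), F(1, 2)]
d = [F(0), F(1, 2), F(1, 2)]

q = mixture(t, a, b, c, d)
for row in q:
    print([str(entry) for entry in row])
```

For this point $q_{11} = q_{22} = 0$ while $q_{12}$ and $q_{21}$ are non-zero, and the entries sum to one, consistent with the pattern the proof extracts from $q \in S_{22}$.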